Metadata types with Scala 3
Strongly-typed programming languages allow us to avoid programming errors by lifting important information about data into the type system. By doing this, we can use the compiler to check our programs, and the stronger your types, the more your compiler can aid you in checking the correctness of your program.
Using types as metadata was possible in Scala 2, but it was far less capable than what Scala 3 allows, thanks to its improvements to tuples and the introduction of match types.
Metadata types in Scala 2
In Scala 2, you'd frequently use intersections for metadata types. For example, if you had a set, and wanted to indicate what types of data were in it, you could use intersections:
import scala.reflect.ClassTag
sealed trait A
sealed trait B
sealed trait C
class Set[T](contents: Map[Class[_], Any]) {
  def get[U](implicit ev1: T <:< U, ev2: ClassTag[U]) = contents(ev2.runtimeClass)
  def put[U](u: U)(implicit ev1: ClassTag[U]) = new Set[T with U](contents.updated(ev1.runtimeClass, u))
}
object Set {
  val empty = new Set[Any](Map.empty)
}
val test = Set.empty.put(new A{}).put(new B{})
test.get[C] // causes a compile-time failure
test.get[A]
As can be seen, the metadata type here is an intersection type: each type put into the Set is added to the set's signature, and retrieval depends on proving that the requested type has been added to the set.
There are a number of weaknesses to this approach to metadata types:
- Preventing the fetching of non-existent types is frustrating
- You cannot remove metadata easily from types built up this way
Preventing the fetching of non-existent types
In the above example, the type of test is Set[A with B]. Under the encoding shown above, you can write test.get[A with B], and even though no value of this type exists in the set, no compile-time error is produced.
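To see why the compiler accepts this, it helps to look at the evidence it is asked to find. The following is a minimal sketch (the object name IntersectionEvidence is ours, not part of the example above) that should compile under both Scala 2.13 and Scala 3; it only shows which subtype evidence exists for an intersection type.

```scala
object IntersectionEvidence {
  sealed trait A
  sealed trait B
  sealed trait C

  // Evidence for each member of the intersection exists, as intended.
  val hasA = implicitly[(A with B) <:< A]
  val hasB = implicitly[(A with B) <:< B]

  // Evidence for the whole intersection also exists, because every type
  // is a subtype of itself. This is what lets get[A with B] through.
  val hasBoth = implicitly[(A with B) <:< (A with B)]

  // The same goes for any common supertype, such as Any.
  val hasAny = implicitly[(A with B) <:< Any]

  // implicitly[(A with B) <:< C] // rejected: C was never added

  def main(args: Array[String]): Unit =
    println("Found evidence for A, B, A with B, and Any")
}
```

In other words, a plain subtype check cannot tell "a type that was added" apart from "a type that happens to be a supertype of the signature".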
In order to avoid this issue, we must create a wrapper type for our metadata:
import scala.reflect.ClassTag
sealed trait Has[T]
sealed trait A
sealed trait B
sealed trait C
class Set[T](contents: Map[Class[_], Any]) {
  def get[U](implicit ev1: T <:< Has[U], ev2: ClassTag[U]) = contents(ev2.runtimeClass)
  def put[U](u: U)(implicit ev1: ClassTag[U]) = new Set[T with Has[U]](contents.updated(ev1.runtimeClass, u))
}
object Set {
  val empty = new Set[Any](Map.empty)
}
val test = Set.empty.put(new A{}).put(new B{})
test.get[C] // causes a compile-time failure
test.get[A]
The Has type doesn't add much and makes the metadata for Set more cluttered to read and write, but it is absolutely necessary to guard against the summoning of non-existent types.
Removal of data from the Set
Compared to our put and get methods, our method for removing data ends up ugly and very unfriendly to our users.
import scala.reflect.ClassTag
sealed trait Has[T]
sealed trait Not[T]
implicit def notAll[T]: Not[T] = new Not[T] {}
implicit def uhoh[T](implicit ev: T): Not[T] = new Not[T] {}
sealed trait A
sealed trait B
sealed trait C
class Set[T](contents: Map[Class[_], Any]) {
  def get[U](implicit ev1: T <:< Has[U], ev2: ClassTag[U]) = contents(ev2.runtimeClass)
  def put[U](u: U)(implicit ev1: ClassTag[U]) = new Set[T with Has[U]](contents.updated(ev1.runtimeClass, u))
  def remove[U, NT](implicit
    ev1: T <:< Has[U],
    ev2: T <:< NT,
    ev3: ClassTag[U],
    ev4: Not[NT <:< Has[U]]
  ): Set[NT] = new Set(contents.removed(ev3.runtimeClass))
}
object Set {
  val empty = new Set[Any](Map.empty)
}
val test = Set.empty.put(new A{}).put(new B{})
test.get[C] // causes a compile-time failure
test.remove[B, Has[A]].get[A]
test.remove[B, Has[B]] // causes a cryptic compile-time failure
test.remove[B, Any] // this miscalculation is not caught
As you can see, our remove method requires the user to calculate the result of removal of the type from the metadata type. As our metadata type grows in complexity, this removal calculation becomes more burdensome on the user, and more likely to be erroneous. While we've added guards to make sure the calculated type doesn't include the removed type, and that the calculated type is a subtype of the original metadata type, we cannot guard against a calculation that says more is deleted than actually was, as can be seen in the last example of the above code.
These weaknesses are observable in ZIO and ZIO 2, whose environment type allows removal by provideLayer, but which requires the user to calculate the result type after the removal.
Metadata types in Scala 3
Scala 3 provides a lot of new features for types, type calculation, and metaprogramming. It also enhances the Tuple type quite a bit, which changes how we should encode metadata types: metadata types should be tuples.
import scala.compiletime.{error, constValue}
import scala.reflect.ClassTag
type Contains[T <: Tuple, U] <: Boolean = T match
  case U *: r => true
  case ? *: r => Contains[r, U]
  case EmptyTuple => false
type Remove[T <: Tuple, U] <: Tuple = T match
  case U *: r => Remove[r, U]
  case t *: r => t *: Remove[r, U]
  case EmptyTuple => EmptyTuple
type Add[T <: Tuple, U] = U *: Remove[T, U]
class MSet[T <: Tuple](content: Map[Class[?], Any]):
  def put[U](u: U)(using ev: ClassTag[U]): MSet[Add[T, U]] = MSet(content.updated(ev.runtimeClass, u))
  inline def get[U](using ev: ClassTag[U]) =
    inline if constValue[Contains[T, U]] then
      content(ev.runtimeClass)
    else
      error("This set does not contain the requested type")
  def remove[U](using ev: ClassTag[U]) = MSet[Remove[T, U]](content.removed(ev.runtimeClass))
object MSet:
  val empty = MSet[EmptyTuple](Map.empty)
final class A
final class B
final class C
MSet.empty.put(A()).put(B()).remove[C].get[B] // compiles
MSet.empty.put(A()).put(B()).remove[C].get[C] // fails to compile
Using match types, we can deconstruct the tuple that acts as MSet's metadata, allowing us to easily test whether data exists in the tuple. Removal of metadata is also calculated automatically instead of requiring the user to do the calculation, and the get function produces a nice error message if a type doesn't exist in the set. Finally, and best of all, the set's signature is relatively clean: MSet[(A,B)] contains an A and a B. A weakness of this new approach stems from how match types reduce: if the compiler cannot prove that the type being matched is disjoint from a case's pattern, it will not move on to the later cases. Effectively, this means that with non-final types, you risk ending up with a metadata type for the set that cannot be computed.
Still, this approach is stronger in almost all cases than anything Scala 2 provided.
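The disjointness weakness can be observed directly. The sketch below restates Contains with named type variables in place of wildcards and asks the compiler, via typeChecks from scala.compiletime.testing, whether it can prove membership. The open traits X and Y are hypothetical stand-ins for non-final types, and the demo method name is ours.

```scala
import scala.compiletime.testing.typeChecks

// Same shape as Contains above, using named type variables in the patterns.
type Contains[T <: Tuple, U] <: Boolean = T match
  case U *: rest    => true
  case head *: rest => Contains[rest, U]
  case EmptyTuple   => false

// Final classes: the compiler can prove that an A is never a B.
final class A
final class B

// Open traits (hypothetical): a class could extend both X and Y,
// so the compiler cannot prove they are disjoint.
trait X
trait Y

@main def stuckMatchTypeDemo(): Unit =
  // A is provably not B, so the first case is skipped and reduction continues.
  println(typeChecks("summon[Contains[(A, B), B] =:= true]")) // true
  // X might also be a Y, so reduction stops at the first case and never
  // reaches the Y that is actually in the tuple.
  println(typeChecks("summon[Contains[(X, Y), Y] =:= true]")) // false
```

This is why the examples in this post use final classes as set members.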
An example use case: Enhanced builder pattern
An example of the benefits we can see with metadata types is the builder pattern. With the standard builder pattern you know from Java, it's very easy to attempt to construct something only to get a runtime exception declaring that what you were trying to build is malformed. This is the builder pattern's greatest weakness compared with constructors, and it holds the pattern back when it comes to Scala.
While Scala provides many features that reduce the need for the builder pattern (curried constructors, named parameters, default parameters), the imperfect synergy between these features sometimes still necessitates it.
Metadata types in Scala 3 can help us overcome the weaknesses in the traditional builder pattern, allowing us to use it safely when needed! Let's see how...
Imagine we're trying to build a storage system for data in our program. We have a common API for the storage:
- load(key: String): Data
- store(key: String, data: Data): Unit
Our storage can be hosted on a database (for which we need a JDBC URI), a filesystem (for which we need a path), or an SFTP server (for which we need a URI). If we're using SFTP, we need to provide a user name and password. If we're using a database, we need to specify whether we need a connection pool and, if so, the class of the connection pool. Finally, for all three forms of storage, we need to set whether it's cached (stores up changes and saves periodically) and, if so, the frequency of fetching from the host.
Let's get started:
trait Data
trait StorageSystem:
  def load(key: String): Data
  def store(key: String, value: Data): Unit
class StorageSystemBuilder[M <: Tuple]
object StorageSystemBuilder:
  sealed trait DatabaseStorage
  sealed trait FilesystemStorage
  sealed trait SFTPStorage
  sealed trait Credentials
  sealed trait ConnectionPoolInfo
  sealed trait CacheInfo
We've set up the API for our storage system, a skeleton for the builder, and a set of metadata tags that we can use to know the state of our builder. Now let's add support for the first build path, file system storage.
import scala.compiletime.ops.boolean.{&&, ||}
import scala.compiletime.{constValue, error}
import scala.annotation.targetName
import java.nio.file.Path
class StorageSystemBuilder[M <: Tuple]:
  import StorageSystemBuilder.*
  def setStorage(p: Path): StorageSystemBuilder[
    FilesystemStorage *: RemoveAll[M, Tuple.Concat[SFTPSpecific, DBSpecific]]
  ] = StorageSystemBuilder()
  def setCacheInfo(cached: false): StorageSystemBuilder[CacheInfo *: M] =
    StorageSystemBuilder()
  @targetName("needsCache")
  def setCacheInfo(
      cached: true,
      syncRateInMs: Long
  ): StorageSystemBuilder[CacheInfo *: M] = StorageSystemBuilder()
  inline def build(): StorageSystem =
    inline if constValue[
        SetEquals[M, CompleteFS] || SetEquals[M, CompleteDB] ||
          SetEquals[M, CompleteSFTP]
      ]
    then
      new StorageSystem:
        def load(key: String): Data = ???
        def store(key: String, data: Data) = ()
    else error("Cannot build. The builder is currently in an incomplete state")
object StorageSystemBuilder:
  type SFTPSpecific = (SFTPStorage, Credentials)
  type DBSpecific = (DatabaseStorage, ConnectionPoolInfo)
  type CompleteFS = (FilesystemStorage, CacheInfo)
  type CompleteDB = (DatabaseStorage, ConnectionPoolInfo, CacheInfo)
  type CompleteSFTP = (SFTPStorage, Credentials, CacheInfo)
  type Remove[T <: Tuple, U] <: Tuple = T match
    case U *: t => Remove[t, U]
    case h *: t => h *: Remove[t, U]
    case EmptyTuple => EmptyTuple
  type RemoveAll[T <: Tuple, U <: Tuple] <: Tuple = U match
    case h *: t => RemoveAll[Remove[T, h], t]
    case EmptyTuple => T
  type Contains[T <: Tuple, U] <: Boolean = T match
    case U *: ? => true
    case ? *: t => Contains[t, U]
    case EmptyTuple => false
  type IsSubsetOrEqualTo[T <: Tuple, U <: Tuple] <: Boolean = T match
    case h *: t => Contains[U, h] && IsSubsetOrEqualTo[t, U]
    case EmptyTuple => true
  type SetEquals[T <: Tuple, U <: Tuple] = IsSubsetOrEqualTo[T, U] &&
    IsSubsetOrEqualTo[U, T]
  sealed trait DatabaseStorage
  sealed trait FilesystemStorage
  sealed trait SFTPStorage
  sealed trait Credentials
  sealed trait ConnectionPoolInfo
  sealed trait CacheInfo
We've added a lot here, so let's look at the individual pieces.
SetEquals
type Contains[T <: Tuple, U] <: Boolean = T match
  case U *: ? => true
  case ? *: t => Contains[t, U]
  case EmptyTuple => false
type IsSubsetOrEqualTo[T <: Tuple, U <: Tuple] <: Boolean = T match
  case h *: t => Contains[U, h] && IsSubsetOrEqualTo[t, U]
  case EmptyTuple => true
type SetEquals[T <: Tuple, U <: Tuple] = IsSubsetOrEqualTo[T, U] && IsSubsetOrEqualTo[U, T]
SetEquals is a type alias that can be used to check whether two tuple types are set-equivalent to each other. That is, given the types (A, B, C, D) and (A, A, C, D, B, D), SetEquals returns true. (A, B, C) and (A, B, D, C, A), however, would return false. This type will be helpful for checking whether we have one of the valid states for our builder.
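The two cases just described can be checked with constValue. This is a small sketch, assuming four hypothetical final classes A to D as tuple members; the type definitions repeat the ones above (with named type variables in place of wildcards) so that the snippet stands alone.

```scala
import scala.compiletime.constValue
import scala.compiletime.ops.boolean.&&

// Hypothetical element types; final so that match type reduction can
// prove them disjoint from one another.
final class A
final class B
final class C
final class D

type Contains[T <: Tuple, U] <: Boolean = T match
  case U *: rest    => true
  case head *: rest => Contains[rest, U]
  case EmptyTuple   => false

type IsSubsetOrEqualTo[T <: Tuple, U <: Tuple] <: Boolean = T match
  case h *: t     => Contains[U, h] && IsSubsetOrEqualTo[t, U]
  case EmptyTuple => true

type SetEquals[T <: Tuple, U <: Tuple] =
  IsSubsetOrEqualTo[T, U] && IsSubsetOrEqualTo[U, T]

@main def setEqualsDemo(): Unit =
  // Same members, different order and duplicates: set-equivalent.
  println(constValue[SetEquals[(A, B, C, D), (A, A, C, D, B, D)]]) // true
  // The second tuple has a D that the first lacks: not set-equivalent.
  println(constValue[SetEquals[(A, B, C), (A, B, D, C, A)]]) // false
```

Because the result is a literal Boolean type, it can feed straight into an inline if, which is how build uses it.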
RemoveAll
type RemoveAll[T <: Tuple, U <: Tuple] <: Tuple = U match
  case h *: t => RemoveAll[Remove[T, h], t]
  case EmptyTuple => T
Given the tuple types (A, B, C, D) and (C, A, B), this match type will return Tuple1[D]. This is used to clear the state of our builder in case we want to change it to a different form.
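As a quick check of that reduction, here is a sketch using hypothetical final classes A to D again. It asks the compiler to prove that the result equals D *: EmptyTuple (the one-element tuple written as Tuple1[D] above), and then inspects the result with Contains.

```scala
import scala.compiletime.constValue

// Hypothetical element types, final so that reduction can proceed.
final class A
final class B
final class C
final class D

type Remove[T <: Tuple, U] <: Tuple = T match
  case U *: t     => Remove[t, U]
  case h *: t     => h *: Remove[t, U]
  case EmptyTuple => EmptyTuple

type RemoveAll[T <: Tuple, U <: Tuple] <: Tuple = U match
  case h *: t     => RemoveAll[Remove[T, h], t]
  case EmptyTuple => T

type Contains[T <: Tuple, U] <: Boolean = T match
  case U *: rest    => true
  case head *: rest => Contains[rest, U]
  case EmptyTuple   => false

// Alias introduced only to keep the lines below short.
type Remaining = RemoveAll[(A, B, C, D), (C, A, B)]

@main def removeAllDemo(): Unit =
  // Compile-time proof: only D is left after removing C, A, and B.
  summon[Remaining =:= (D *: EmptyTuple)]
  println(constValue[Contains[Remaining, D]]) // true
  println(constValue[Contains[Remaining, A]]) // false
```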
We've added the setStorage method to the builder, which declares the builder to be building storage based on the file system and clears other options that are not related to file system storage. This isn't strictly necessary, just a nicety. We've also declared two setCacheInfo methods. One only accepts false and turns off caching; the other only accepts true together with a refresh timing parameter. This means that if you don't need caching you don't need to provide a refresh period, and if you do need caching, you don't have to pass the refresh period in a Some value.
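The literal-type trick behind the two setCacheInfo overloads works on its own, outside the builder. The following sketch uses a hypothetical CacheSettings object whose methods return strings rather than builders, purely to show which calls the compiler accepts.

```scala
object CacheSettings:
  // Only the literal `false` is accepted here, so no sync rate is needed.
  def describe(cached: false): String = "caching disabled"

  // Only the literal `true` is accepted here, and it must come with a sync rate.
  def describe(cached: true, syncRateInMs: Long): String =
    s"caching enabled, syncing every $syncRateInMs ms"

@main def literalOverloadDemo(): Unit =
  println(CacheSettings.describe(false))      // caching disabled
  println(CacheSettings.describe(true, 500L)) // caching enabled, syncing every 500 ms
  // Neither of the following compiles, because no overload fits:
  // CacheSettings.describe(true)
  // CacheSettings.describe(false, 500L)
```

One consequence of this design is that the flag must be a literal at the call site; a value of plain Boolean type fits neither overload.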
We've also added our build method, which uses the type-level || from scala.compiletime.ops.boolean together with SetEquals to determine whether the builder is currently in one of the three acceptable states and, if not, produces a compile-time error.
Now let's add the machinery for the database hosting:
  @targetName("dbStorage")
  def setStorage(p: JDBCUri): StorageSystemBuilder[
    DatabaseStorage *: RemoveAll[M, SFTPSpecific]
  ] = StorageSystemBuilder()
  inline def noConnectionPool: StorageSystemBuilder[ConnectionPoolInfo *: M] =
    inline if constValue[Contains[M, DatabaseStorage]]
    then StorageSystemBuilder()
    else error("This setting can only be used with DatabaseStorage")
  inline def setConnectionPool(
      connectionPoolClass: Class[?]
  ): StorageSystemBuilder[ConnectionPoolInfo *: M] =
    inline if constValue[Contains[M, DatabaseStorage]]
    then StorageSystemBuilder()
    else error("This setting can only be used with Database storage")
The changes are relatively few this time. We add a new overload to setStorage accepting a JDBCUri, and this overload removes SFTP-specific options from the metadata. There are no filesystem-specific options, so none are cleared. In the noConnectionPool and setConnectionPool methods, compile-time errors occur if they are used without database storage.
Finally, we add the machinery for the SFTP storage:
  @targetName("sftpStorage")
  def setStorage(p: URI): StorageSystemBuilder[
    SFTPStorage *: RemoveAll[M, DBSpecific]
  ] = StorageSystemBuilder()
  @targetName("sftpStorageComplete")
  def setStorage(
      p: URI,
      username: String,
      password: String
  ): StorageSystemBuilder[SFTPStorage *: Credentials *: RemoveAll[M, DBSpecific]] =
    StorageSystemBuilder()
  inline def setCredentials(
      username: String,
      password: String
  ): StorageSystemBuilder[Credentials *: M] =
    inline if constValue[Contains[M, SFTPStorage]]
    then StorageSystemBuilder()
    else error("Builder must be in sftp storage mode to use `setCredentials`")
This is much the same as the DB pieces, except we've added a fourth overload of setStorage that accepts credentials immediately and adds the Credentials metadata type to our metadata.
We've now got our builder set up, so let's see it in action!
import java.nio.file.Paths // needed for Paths.get below
StorageSystemBuilder.init.setStorage(JDBCUri()).noConnectionPool.setCacheInfo(false).build()
StorageSystemBuilder.init.noConnectionPool // compile error: This setting can only be used with DatabaseStorage
StorageSystemBuilder.init.setStorage(Paths.get("bar")).setCredentials("foo", "baz") // compile error: Builder must be in sftp storage mode to use `setCredentials`
StorageSystemBuilder.init.setStorage(JDBCUri()).noConnectionPool.build() // compile error: Cannot build. The builder is currently in an incomplete state
As you can see, our builder is now compile-time checked against invalid configurations. The set-up chosen for this example is purely optional, and you can do things much differently, but this usage of metadata types really strengthens builders.
Here's the complete source code for the builder example:
import scala.compiletime.ops.boolean.{&&, ||}
import scala.compiletime.{constValue, error}
import scala.annotation.targetName
import java.nio.file.Path
import java.net.URI
trait Data
trait StorageSystem:
  def load(key: String): Data
  def store(key: String, value: Data): Unit
class JDBCUri
class StorageSystemBuilder[M <: Tuple]:
  import StorageSystemBuilder.*
  def setStorage(p: Path): StorageSystemBuilder[
    FilesystemStorage *: RemoveAll[M, Tuple.Concat[SFTPSpecific, DBSpecific]]
  ] = StorageSystemBuilder()
  @targetName("dbStorage")
  def setStorage(p: JDBCUri): StorageSystemBuilder[
    DatabaseStorage *: RemoveAll[M, SFTPSpecific]
  ] = StorageSystemBuilder()
  @targetName("sftpStorage")
  def setStorage(p: URI): StorageSystemBuilder[
    SFTPStorage *: RemoveAll[M, DBSpecific]
  ] = StorageSystemBuilder()
  @targetName("sftpStorageComplete")
  def setStorage(
      p: URI,
      username: String,
      password: String
  ): StorageSystemBuilder[SFTPStorage *: Credentials *: RemoveAll[M, DBSpecific]] =
    StorageSystemBuilder()
  inline def setCredentials(
      username: String,
      password: String
  ): StorageSystemBuilder[Credentials *: M] =
    inline if constValue[Contains[M, SFTPStorage]]
    then StorageSystemBuilder()
    else error("Builder must be in sftp storage mode to use `setCredentials`")
  inline def noConnectionPool: StorageSystemBuilder[ConnectionPoolInfo *: M] =
    inline if constValue[Contains[M, DatabaseStorage]]
    then StorageSystemBuilder()
    else error("This setting can only be used with DatabaseStorage")
  @targetName("usingConnectionPool")
  inline def setConnectionPool(
      connectionPoolClass: Class[?]
  ): StorageSystemBuilder[ConnectionPoolInfo *: M] =
    inline if constValue[Contains[M, DatabaseStorage]]
    then StorageSystemBuilder()
    else error("This setting can only be used with Database storage")
  def setCacheInfo(cached: false): StorageSystemBuilder[CacheInfo *: M] =
    StorageSystemBuilder()
  @targetName("needsCache")
  def setCacheInfo(
      cached: true,
      syncRateInMs: Long
  ): StorageSystemBuilder[CacheInfo *: M] = StorageSystemBuilder()
  inline def build(): StorageSystem =
    inline if constValue[
        SetEquals[M, CompleteFS] || SetEquals[M, CompleteDB] ||
          SetEquals[M, CompleteSFTP]
      ]
    then
      new StorageSystem:
        def load(key: String): Data = ???
        def store(key: String, data: Data) = ()
    else error("Cannot build. The builder is currently in an incomplete state")
object StorageSystemBuilder:
  val init: StorageSystemBuilder[EmptyTuple] = StorageSystemBuilder()
  type SFTPSpecific = (SFTPStorage, Credentials)
  type DBSpecific = (DatabaseStorage, ConnectionPoolInfo)
  type CompleteFS = (FilesystemStorage, CacheInfo)
  type CompleteDB = (DatabaseStorage, ConnectionPoolInfo, CacheInfo)
  type CompleteSFTP = (SFTPStorage, Credentials, CacheInfo)
  type Remove[T <: Tuple, U] <: Tuple = T match
    case U *: t => Remove[t, U]
    case h *: t => h *: Remove[t, U]
    case EmptyTuple => EmptyTuple
  type RemoveAll[T <: Tuple, U <: Tuple] <: Tuple = U match
    case h *: t => RemoveAll[Remove[T, h], t]
    case EmptyTuple => T
  type Contains[T <: Tuple, U] <: Boolean = T match
    case U *: ? => true
    case ? *: t => Contains[t, U]
    case EmptyTuple => false
  type IsSubsetOrEqualTo[T <: Tuple, U <: Tuple] <: Boolean = T match
    case h *: t => Contains[U, h] && IsSubsetOrEqualTo[t, U]
    case EmptyTuple => true
  type SetEquals[T <: Tuple, U <: Tuple] = IsSubsetOrEqualTo[T, U] &&
    IsSubsetOrEqualTo[U, T]
  sealed trait DatabaseStorage
  sealed trait FilesystemStorage
  sealed trait SFTPStorage
  sealed trait Credentials
  sealed trait ConnectionPoolInfo
  sealed trait CacheInfo
Happy Scala hacking!