Types
Soufflé’s type system is static. Types are used to define the attributes of a relation. Static types have the advantage that they can be enforced at compiletime; evaluation speed is improved as no dynamic checks are needed. Another advantage of the type system is that it supports the programmer by enforcing the correct use of relations.
The Type system is based on the idea of sets. A Type can be either the universe of all possible values, or a subset. Subsets can be composed of other subsets.
Primitive Types
Soufflé has two primitive types:
 Symbol type:
symbol
 Number type:
number
Symbol type
The symbol type consists of the universe of all strings.
Internally, the symbol type is represented by an ordinal number.
The ordinal number for a symbol can be determined by using the ord
command, e.g., ord(“hello”) represents the ordinal number for “hello”.
Number type
The number type consists of the universe of all numbers The accepted range of numbers is restricted by architecture. Soufflé accepts 32 bit signed numbers.
Primitive type usage
decl Name(n: symbol )
Name("Hans").
Name("Gretl").
.decl Translate(n: symbol , o: number )
.output Translate
Translate(x,ord(x)) : Name(x).
Beyond Primitive Types
Primitive types are insufficient for large projects. There is a need for a way to ensure that the wrong attributes will not be bound. Symbols of different types need to be differentiated. The ability to form partial orders over subsets allows the formation of ontologies.
Base Type
Symbol types for attributes are defined by the .symbol_type
declarative, e.g.,
.symbol_type City
.symbol_type Town
.symbol_type Village
Here we have defined distinct sets of symbols from the universe of possible symbols.
Union Type
The Union type unifies a fixed number of symbol set types, of either base or union types.
.type <ident1> = <ident1>  <ident2>  ...  <identk>
For example,
.type Place = City  Town  Village
We can bring these together to define attributes that better describe a relation, e.g.,
.symbol_type City
.symbol_type Town
.symbol_type Village
.type Place = City  Town  Village
.decl Data(c:City, t:Town, v:Village)
Data(“Sydney”, ”Ballina”, “Glenrowan”).
.decl Location(p:Place)
.output Location
Location(p) : Data(p,_,_); Data(_,p,_); Data(_,_,p).
Domains
The semantic of types is defined by their associated domains. Let T be a type and D(T) denote its associated domain. The domains are given by the following rules:
 D(number) = D(uint32_t) (or altered by compiler flag)
 D(symbol) is the set of all strings over the set of printable characters
 D(U) ⊂ D(number) if U is defined by
.number_type U
 D(S) ⊂ D(symbol) if S is defined by
.symbol_type S

D( T1 T2 … Tn ) ⊇ D(T1) ∪ D(T2) ∪ … ∪ D(Tn)  D( [ f1 : T1 , f2 : T2 , … , fn : Tn ] ) ⊇ D(T1) x D(T2) x … x D(Tn)
Furthermore, for two primitive types U and S we have that
 D(U) ∩ D(S) = ∅
Thus, by definition, no two primitive userdefined types U and S may have common elements. Consider the following type definitions:
.number_type weight
.number_type length
For both types, weight and length, the value 30 is a valid value. However, an expression of value 30 and type weight
is not considered equivalent to an expression of value 30 and type length
. Both values may be considered to be tagged by their type and the type is essential for equivalence.
Consequently, with the Datalog program
.number_type weight
.number_type length
.decl A(w:weight)
.decl B(l:length)
A(X) : B(X).
the trailing rule will cause a typing error since no value of the first attribute of B will ever be in the domain of the first attribute of A, since D(weight) ∩ D(length) = ∅.
This interaction between types is generalized by the subtype relation to be covered next.
Subtyping
The subtype relation among types provides a foundation for determining whether there might be values in the intersection between (potentially infinite) domains of types. Intuitively, a type U is a subtype of S, denoted by U <: S, if all elements in the domain of U are also to be found in the domain of S. The following rules define the subtype relation:
The subtype relation is reflexive and transitive:
 S <: S for all types S
 if S <: T and T <: U we have S <: U for all types S, T, and U
All primitive types are subtypes of their predefined base type:
 U <: number if U is defined by
.number_type U
 U <: symbol if U is defined by
.symbol_type U
Thus, every value of a primitive type, e.g. the information that a weight is 40, can ‘decay’ to an ordinary number without the implicit typetag of being a weight.
Finally, for union types the following rules hold:

U <: T1 … Tn if there is a 1 ≤ i ≤ n such that U <: Ti 
T1 … Tn <: U if for all 1 ≤ i ≤ n we have Ti <: U 
U1 … Un <: T1 … Tm if for all 1 ≤ i ≤ n there is a 1 ≤ j ≤ m such that Ui <: Tj
The first two rules define the relation between union types and nonunion types and the third rule defines the relation between two union types. Note that a union type consisting of a single type is a subtype of this type as well as the other way around.
There are no subtyping relations for record types. Each record type constitutes its own, distinct domain. Thus, no element of one domain may be found in the domain of any other type’s domain.
Pitfalls
As an example, consider the following code fragment:
.type even = number
.type odd = number
.decl A(x:even)
.decl B(x:odd)
A(X) : B(X)
In this example, the types even
and odd
are defined as union types over the single type number
. Thus, they are effectively equivalent and no typing error will result from the rule stated for relation A
.
However, when the aim of defining the two types even
and odd
was to enforce that those are two disjunctive sets of integer values, the rule stated for A
should trigger at least a warning since no X
should be within both – the even
and odd
domain.
The problem is caused by utilizing a degenerated version of a union for the definition of the new types. A more accurate modeling is shown here:
.number_type even
.number_type odd
.decl A(x:even)
.decl B(x:odd)
A(X) : B(X)
In this version, even
and odd
are defined as two disjunctive subtypes of the numbertype, each exhibiting its own domain of values. From this definition Souffle can deduce that the users intention was to define two disjunctive sets and will report an error for the rule in the last line.
The type system is designed to support the developer of a Datalog program in enforcing constraints on the ‘dataflow’ within the resulting program. However, to enable the system to do so, an accurate modeling of the categories of values in form of a type system is required – as has been illustrated by the example above.