Enumeration Extensions

Kenneth Baclawski
College of Computer and Information Science
Northeastern University

Enumerations are supported in all major programming languages as well as major representational languages such as XML Schema and OData CSDL. This article addresses the question of whether one can define a notion of extension or derivation for enumerations, that is, whether one can support a mechanism for reusing a base enumeration by adding members. We consider three possible approaches and discuss the advantages and disadvantages of each. We also show how one could extend OData CSDL and Java to allow for deriving enumerations from other enumerations.

Section 1. Possible Approaches for Deriving Enumerations

  1. Extension. The list of members is extended with new members. Unfortunately, the derived enumeration cannot be a subclass because it does not satisfy substitutability. Here is an example:

    enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday }
    

    enum Day extends Weekday { Saturday, Sunday }
    

    In this case, Day cannot be a subclass of Weekday because a Day variable cannot be substituted in a Weekday context. Consequently, in this approach, the enumeration types Weekday and Day must be disjoint. The disadvantage is that one now has two independent notions of the base members. For example, there are now two different Monday members that can (indeed, should) be implemented incompatibly. This clashes with programming languages, such as C, C++, and C#, that regard members as being symbolic constants that can be explicitly specified. It also clashes with representation languages such as OData CSDL which can serialize a member of an enumeration using either the name or the underlying value.

  2. Mapping. In this approach one must map the additional members to base members. Here is an example:

    enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday, Other }
    

    enum Day maps to Weekday { Saturday => Other, Sunday => Other }
    

    This approach has the advantage that it supports substitutability. However, it has some disadvantages.

    1. The base enumeration must be designed to allow for extension. Most enumerations do not have a "catch-all" member.
    2. There is added complexity for using the base enumeration even if it is never extended. In the example, one must now allow for the Other case even when it never occurs.
    3. Many programming languages, such as C, C++, and C#, regard members as being symbolic constants. The same is true for OData CSDL. In these languages, one can explicitly override the default implementation of a member. One can specify both its value and its integral type. Mapping would break this model by causing the value of a constant to depend on its context. In other words, a member is no longer constant.

  3. Superclass. The derived enumeration is a superclass. Here is an example:

    enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday }
    

    enum Day superclass of Weekday { Saturday, Sunday }
    

    This satisfies substitutability so it is compatible with the use of an enumeration type as a parameter of a polymorphic method. It is compatible with the model in which the members are symbolic constants. It also does not require the base enumeration to be designed to allow for it to be derived.

    There isn't a name for this kind of derivation in major programming languages. There is also no notion of a parent obtaining anything from a child in a hierarchy or even a word for such a process in major programming languages. The words "inheritance", "derivation", and "extension" are always from parent to child. In XML Schema, one can define a restriction of a data type, including an enumeration, but one cannot extend an enumeration.

    There are terms for this kind of derivation in contexts other than major programming languages. In UML, derivation of a subtype is called "specialization", and the inverse of this relationship is called "generalization". In bibliographic hierarchies, the relationship for specialization is called "narrow", and the inverse relationship is called "widen".

    Using the term "extends" for adding members to an enumeration would be confusing because "extends" is already well-established for denoting subclasses, which is not the case for adding members to an enumeration. Accordingly, I propose to use the term "widen" which seems to be the easiest to understand. Here is the previous example using the widen term:

    enum Weekday { Monday, Tuesday, Wednesday, Thursday, Friday }
    

    enum Day widens Weekday { Saturday, Sunday }
    

    Here is how it would look in OData CSDL:

    <EnumType Name="Weekday">
      <Member Name="Monday"/>
      <Member Name="Tuesday"/>
      <Member Name="Wednesday"/>
      <Member Name="Thursday"/>
      <Member Name="Friday"/>
    </EnumType>
    
    <EnumType Name="Day" Widens="Weekday">
      <Member Name="Saturday"/>
      <Member Name="Sunday"/>
    </EnumType>
    

    The main disadvantage of this approach is that major programming languages do not have this concept, and it is unclear whether they would be open to adding it. This approach requires the introduction of a new keyword, "widens". It might be somewhat counter-intuitive because the derivation of the new enumeration produces a supertype rather than a subtype. Deriving a supertype from a subtype is not a process that occurs in major programming languages. However, in practice, it should be clear.

    There are a few minor issues that arise. In C, C++, C# and OData CSDL, one can specify the integral type of an enumeration. Given that widening defines a supertype, one could, in theory, specify the integral type of a widened type which would then cause the base enumeration to inherit the integral type specified by the widened type. This would break the implementation of the base enumeration. Accordingly, one should disallow specifying the integral type of a widening.

    Another minor issue that might arise with this approach is multiple inheritance. Not all programming languages allow multiple inheritance, so in such a context one would not be able to widen an enumeration more than once. Even when multiple inheritance is allowed, there are ambiguities that could occur when one widens an enumeration more than once. Here is an example using OData CSDL notation:

    <EnumType Name="Size">
      <Member Name="Small"/>
      <Member Name="Medium"/>
      <Member Name="Large"/>
    </EnumType>
    
    <EnumType Name="JumboSize" Widens="Size">
      <Member Name="Jumbo"/>
    </EnumType>
    
    <EnumType Name="ExtraSize" Widens="Size">
      <Member Name="XLarge"/>
      <Member Name="XXLarge"/>
    </EnumType>
    
    <EnumType Name="SuperSize" Widens="ExtraSize">
      <Member Name="Jumbo"/>
    </EnumType>
    

    In this case, there are two meanings for Jumbo. When viewed as symbolic constants, the two meanings of Jumbo have different values. In JumboSize, it has value 3, while in SuperSize it has value 5. Presumably, such an ambiguity could be detected and a suitable error message could be generated.
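
As a rough illustration of how such an ambiguity could be detected, the following Java sketch models the four enumeration types above and reports member names that receive different values in different types. It assumes zero-based positional values with base members listed first. The class names EnumModel, WideningChecker and WideningDemo are hypothetical and are not part of OData or of any library.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical model of an EnumType with an optional Widens attribute. */
final class EnumModel {
    final String name;
    final EnumModel base;           // null when the type widens nothing
    final List<String> ownMembers;  // members declared directly on this type

    EnumModel(String name, EnumModel base, String... ownMembers) {
        this.name = name;
        this.base = base;
        this.ownMembers = List.of(ownMembers);
    }

    /** All members in order: base members first, then this type's own. */
    List<String> allMembers() {
        List<String> all = new ArrayList<>();
        if (base != null) {
            all.addAll(base.allMembers());
        }
        all.addAll(ownMembers);
        return all;
    }

    /** Zero-based positional value of a member, or -1 if it is absent. */
    int valueOf(String member) {
        return allMembers().indexOf(member);
    }
}

final class WideningChecker {
    /** Returns the member names that have different values in different types. */
    static List<String> conflicts(List<EnumModel> types) {
        Map<String, Integer> seen = new LinkedHashMap<>();
        List<String> conflicting = new ArrayList<>();
        for (EnumModel type : types) {
            for (String member : type.allMembers()) {
                int value = type.valueOf(member);
                Integer previous = seen.putIfAbsent(member, value);
                if (previous != null && previous != value && !conflicting.contains(member)) {
                    conflicting.add(member);
                }
            }
        }
        return conflicting;
    }
}

final class WideningDemo {
    public static void main(String[] args) {
        EnumModel size = new EnumModel("Size", null, "Small", "Medium", "Large");
        EnumModel jumboSize = new EnumModel("JumboSize", size, "Jumbo");
        EnumModel extraSize = new EnumModel("ExtraSize", size, "XLarge", "XXLarge");
        EnumModel superSize = new EnumModel("SuperSize", extraSize, "Jumbo");

        System.out.println(jumboSize.valueOf("Jumbo")); // prints 3
        System.out.println(superSize.valueOf("Jumbo")); // prints 5
        System.out.println(WideningChecker.conflicts(
                List.of(size, jumboSize, extraSize, superSize))); // prints [Jumbo]
    }
}
```

This sketch only flags names whose values differ. A stricter check could reject any name that is declared in more than one widening, regardless of the values assigned.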

In conclusion, it appears that the best approach is the third one. The main disadvantage is the introduction of a new keyword and the small possibility of it being counter-intuitive for some programmers. There are also some minor issues, but these do not seem to be insurmountable.

Section 2. Extending OData CSDL to Allow Deriving Enumerations

This section shows how one could allow enumeration extension in OData CSDL. The main disadvantage for adding this feature to OData CSDL is that major programming languages do not have a notion of enumeration extension so mapping this feature to a major programming language might be problematic. However, all of the approaches would be problematic for mapping to a programming language. The third approach is arguably the one with the fewest difficulties.

As an example of how to deal with this for a major programming language, consider Java. A widened Java enumeration would be mapped to an independent enum type with both the members of the base enumeration and the members added by the widening. It would also be desirable to add a method to the base enum type that converts an instance of the base enum type to an instance of the widened enum type. Polymorphism can be supported by having the two enum types implement a common interface. The methods of the interface would invoke the conversion method to convert any base members to the widened enum type, as in, for example, the approach of Yifan Peng in "How to extend enum in Java".
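
The mapping described above might look roughly like the following in present-day Java. This is only a sketch under stated assumptions: the interface name DayLike and the conversion method toDay() are hypothetical names chosen here, and the code is not claimed to reproduce Peng's approach.

```java
/** Hypothetical common interface that lets both enum types be used polymorphically. */
interface DayLike {
    /** Converts a member to the widened enum type. */
    Day toDay();

    /** Interface method that works on the widened type after conversion. */
    default boolean isWeekend() {
        Day d = toDay();
        return d == Day.SATURDAY || d == Day.SUNDAY;
    }
}

/** The base enumeration, with the added conversion method. */
enum Weekday implements DayLike {
    MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY;

    @Override
    public Day toDay() {
        // Relies on Day repeating every base member under the same name.
        return Day.valueOf(name());
    }
}

/** The widened enumeration, mapped to an independent enum type. */
enum Day implements DayLike {
    // Base members first, so that their ordinals match those in Weekday.
    MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY,
    // Members added by the widening.
    SATURDAY, SUNDAY;

    @Override
    public Day toDay() {
        return this;
    }
}

final class MappingDemo {
    /** Accepts members of either enum type through the common interface. */
    static String describe(DayLike d) {
        return d.toDay() + (d.isWeekend() ? " is a weekend day" : " is a weekday");
    }

    public static void main(String[] args) {
        System.out.println(describe(Weekday.MONDAY)); // prints "MONDAY is a weekday"
        System.out.println(describe(Day.SUNDAY));     // prints "SUNDAY is a weekend day"
    }
}
```

Listing the base members first in Day keeps the ordinal of each base member the same in both enum types, which matches the ordering rule proposed for widened enumerations.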

The following are the changes one would need to make to OData CSDL to allow one to widen an enumeration type.

  1. Add a new attribute to edm:EnumType:

    8.1.3 The edm:Widens Attribute
    

    An enumeration type may specify that it widens another enumeration type. This adds additional members to the other enumeration type. The enumeration type being widened is called the base enumeration type. Widening satisfies substitutability. A member of the base enumeration type is also a member of the widened enumeration type. A widened enumeration type has the same values for the edm:IsFlag and edm:UnderlyingType attributes as the base enumeration type. A widened enumeration type MUST NOT specify values for these two attributes.

  2. Add the following to the definition of the edm:Name attribute of edm:Member (section 8.2.1):

    An enumeration type that widens another enumeration type MUST NOT have a member with the same name as a member of the enumeration type that is being widened. Multiple widenings of the same base enumeration type are allowed, but different widenings MUST NOT have members with the same name.

  3. Add the following to the definition of the edm:Value attribute of edm:Member (section 8.2.2):

    If the enumeration type widens another enumeration type, then the values for the members of the widened enumeration type are determined as if the list of members of the widened enumeration type consists of the list of members of the base enumeration type followed by the list of members of the widened enumeration type. For example,

    <EnumType Name="ExtendedShippingMethod" Widens="ShippingMethod">
      <Member Name="Economy"/>
    </EnumType>
    

    defines an enumeration type with four members: FirstClass, TwoDay, Overnight, and Economy, in this order. In this example, Economy MUST be assigned a value of 6 since Overnight is assigned a value of 5.

Section 3. Extending Java to Allow Deriving Enumerations

There have been suggestions for how one can simulate extending enumerations in Java, such as that of Yifan Peng in "How to extend enum in Java". However, these approaches require one to modify the base enum in some fashion. This section shows how one could allow enumeration extension in Java without modifying the base enumeration. Unlike programming languages that treat enumerations as being a collection of symbolic constants, Java enums are classes that have fields, constructors and methods. Consequently, there are a number of issues that must be resolved to allow enumeration extension:

  1. Fields in an enumeration. Since a widening is a superclass, it can only have fewer fields, not more fields, than the base enumeration. As there is no syntax in Java for removing a field, the most natural approach is for the widening to have exactly the same fields as the base enumeration. Access to fields is governed by access specifiers (public, private, protected, default), and there is no reason to change the meaning of these.

  2. Constructors in an enumeration. Constructors are not inherited, so they can be defined as usual. Unlike constructors in general, the constructors for enumerations are only invoked when the members are defined in the enumeration. No explicit invocation of an enum constructor with the "new" keyword is allowed. As enumerations cannot currently be explicitly subclassed, no existing code would be broken if invoking a superclass constructor (using the keyword "super") were forbidden.

  3. A widening could have methods. Since the widening is a superclass of the base enumeration, it would seem natural for the base enumeration to inherit the methods of the widening, unless there is a method in the base enumeration which has the same signature and therefore overrides the method in the widening. However, this presents a serious problem. Consider this example of two enum types in the same package:

    enum Weekday {
      MONDAY("Monday"), 
      TUESDAY("Tuesday"), 
      WEDNESDAY("Wednesday"), 
      THURSDAY("Thursday"), 
      FRIDAY("Friday");
      final String name;
      Weekday(String name) {
        this.name = name;
      }
      public String getName() {
        return name;
      }
    }
    
    enum Day widens Weekday {
      SATURDAY("Saturday"),
      SUNDAY("Sunday");
      Day(String name) {
        this.name = name;
      }
      public boolean isWeekend() {
        switch (this) {
        case SATURDAY: case SUNDAY:
          return true;
        default:
          return false;
        }
      }
    }
    

    Now suppose that v is a variable of type Weekday. The expression v.isWeekend() would result in a compile-time error if the Day enum were not known to the compiler, but would be acceptable if the Day enum were known at compile time. Rather than allow such an indeterminacy, one should disallow expressions such as v.isWeekend(). If one wishes to invoke this method on v, one should first convert it to have type Day, as in the following:

    Day w = v;
    System.out.println(w.isWeekend());
    

  4. Since Java does not support multiple inheritance, a Java enum cannot be widened more than once.

There are also some features of Java enumerations that would not present any problems:

  1. The implicitly defined static method values() returns an array of the members of the enumeration. There should not be any difficulty supporting this for a widening of a base enumeration.

  2. A switch statement on an expression whose type is an enumeration type has a well-defined range of values that is exactly the same as that returned by the static method values().

In conclusion, the following would need to be added to the Java language standard:

  1. An enumeration type MAY specify that it widens another enumeration type. This adds additional members to the other enumeration type. The enumeration type being widened is called the base enumeration type. Widening satisfies substitutability. A member of the base enumeration type is also a member of the widened enumeration type. In other words, the base enumeration type may be regarded as a subclass of the widened enumeration type, except as noted below.

  2. A widened enumeration has the same fields as the base enumeration, but the usual access restrictions apply. In particular, a private field cannot be accessed outside the base enumeration.

  3. A widened enumeration MAY have constructors. Such constructors MUST NOT be invoked by the base enumeration.

  4. A widened enumeration MAY have methods, but such methods MUST NOT be invoked by an expression whose type is the base enumeration.

  5. An enumeration type that widens another enumeration type MUST NOT have a member with the same name as a member of the enumeration type that is being widened.

  6. An enumeration type MUST NOT have more than one widening because this would result in multiple inheritance.

  7. If an enumeration type widens another enumeration type, then the ordinal values for the members of the widened enumeration type MUST be determined as if the list of members of the widened enumeration type consists of the list of members of the base enumeration type followed by the list of members of the widened enumeration type.

  8. Applying the getSuperclass() method of Class<T> to an enumeration type that has been widened MUST return the class object of the widened enumeration rather than Enum<E extends Enum<E>>.
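
For comparison, the following sketch shows what getSuperclass() returns for an enumeration type in Java as it exists today, which is the behavior that item 8 would change for widened enumerations. The class name SuperclassDemo and the helper superclassOf are chosen here for illustration only.

```java
/** An ordinary enum, as Java supports it today. */
enum Weekday { MONDAY, TUESDAY, WEDNESDAY, THURSDAY, FRIDAY }

final class SuperclassDemo {
    /** Returns the direct superclass of the given class object. */
    static Class<?> superclassOf(Class<?> type) {
        return type.getSuperclass();
    }

    public static void main(String[] args) {
        // Today, the direct superclass of an enum type is java.lang.Enum.
        System.out.println(superclassOf(Weekday.class)); // prints "class java.lang.Enum"

        // Under the proposal, a widened Weekday would report Day.class here instead.
        System.out.println(Weekday.MONDAY.getDeclaringClass() == Weekday.class); // prints "true"
    }
}
```

Existing code that tests for an enum by comparing getSuperclass() with Enum.class would be affected by item 8, whereas code that uses Class.isEnum() or getDeclaringClass() depends on how those methods would be specified for widened enumerations, which this article does not address.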