SGEN Data Structure Generator User's Manual

Revision: $CambiosHeader: sgen3/sgen/sgenconv.htm,v 3.1 2021/04/15 23:37:35 cambios Exp $
Jonathan W. Greene
sgeninfo@cambioscomputing.com

Copyright (c) 2000-2002 Jonathan W. Greene.
Copyright (c) 2006-2009 Cambios Computing, LLC.

Contents

Overview
Background
Writing the Input File
Running Sgen
Using the Generated Code
Glossary
Customization
Implementation Details
Licensing
Acknowledgements

Overview

SGEN is a program that generates C++ code for data structures from a high-level specification. The data are composed of objects (instances of classes). Each object may have attributes and relationships to other objects.

An attribute is a named property of an object whose value is some primitive type such as an integer or string, or a vector or map of primitive types. Operations are provided to get and reset the attribute's value(s).

A relationship is a link between a parent object and one or more child objects. Examples include:

Sets of closely related objects are grouped into modules. (A module generally corresponds to a library in the link hierarchy.)

The input to SGEN is a terse file specifying the module name(s), primitive types, classes of object, attributes and relationships. SGEN generates a C++ class for each object. Each attribute and relation gives rise to a set of supporting methods, as well as private data members to store the necessary data. All necessary implementation code is automatically created (not just the method declarations).

For example, suppose we have a module called Db with two objects, Building and Room. There is a relationship between the two such that each Building (the parent) has a vector of Rooms (the children). The building keeps a count of how many rooms it has. Also, a room can find its owning building and its index (room number) in the building. We put the following line in the SGEN input file:

Vector("DbBuilding", "DbRoom", Counted=1, GetOwner=1, GetIndex=1)

SGEN will then generate declarations and implementation code for the following methods:

class DbBuilding {
public:

  /** Returns child Room with index x, or null if none. */
  DbRoom * findRoom(int x) const;
  /** Attach given child Room, chld, under index x. */
  void attachRoom(int x, DbRoom * chld);
  /** Detach and return child Room attached under index x. It is assumed that there is one. */
  DbRoom * dxRoom(int x);
  /** Detach chld. */
  void detachRoom(DbRoom * chld);
  /** Returns current size of vector. */
  inline int sizeRoom() const;
  /** Returns current storage capacity of vector. */
  inline int capacityRoom() const;
  /** Resizes vector. If n>capacity, capacity is increased to n as well. Assumes no children at locations x>=n. */
  void resizeRoom(int n);
  /** If n > capacity, raises capacity to n. If n == 0, resizes to 0 and frees storage. */
  void reserveRoom(int n);
  /** Number of child Rooms attached. */
  inline int numRoom() const;
  /** A const_iterator to first child Room. */
  inline DbBuildingRoom_Itr beginRoom() const;
  /** A const_iterator after last child Room. */
  inline DbBuildingRoom_Itr endRoom() const;
  ...
};

class DbRoom {
public:

  /** Parent Building to which this child Room is currently attached, or null if none. */
  inline DbBuilding * ownerBuilding() const;
  /** Returns index of this child Room in its parent Building. */
  inline int indexBuilding() const;
  ...
};

The various types DbRoom *, etc. are typedefs for pointers to the corresponding objects. The various types DbBuildingRoom_Itr, etc. are typedefs for const_iterators to walk a set of child objects (e.g. DbRoom *).
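
To make the Building/Room example more concrete, the following is a minimal hand-written sketch of how such a vector relation with GetOwner and GetIndex might behave. It is not SGEN's actual output: the data members (rooms_, num_, owner_, index_) and the helper demoBuilding are assumptions made for illustration, and only a few of the listed methods are mimicked.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class DbBuilding;

// Hand-written stand-in for the child class (NOT actual SGEN output).
class DbRoom {
public:
    DbBuilding *ownerBuilding() const { return owner_; }  // GetOwner=1
    int indexBuilding() const { return index_; }          // GetIndex=1
private:
    friend class DbBuilding;
    DbBuilding *owner_ = nullptr;  // hypothetical data member
    int index_ = -1;               // hypothetical data member
};

// Hand-written stand-in for the parent class (NOT actual SGEN output).
class DbBuilding {
public:
    DbRoom *findRoom(int x) const {
        return (x >= 0 && x < sizeRoom()) ? rooms_[static_cast<std::size_t>(x)] : nullptr;
    }
    void attachRoom(int x, DbRoom *chld) {
        assert(x >= 0 && x < sizeRoom());        // manual sizing: slot must exist
        assert(findRoom(x) == nullptr);          // slot must be free
        assert(chld->owner_ == nullptr);         // child must be unattached
        rooms_[static_cast<std::size_t>(x)] = chld;
        chld->owner_ = this;
        chld->index_ = x;
        ++num_;                                  // Counted=1
    }
    void detachRoom(DbRoom *chld) {
        assert(chld->owner_ == this);
        rooms_[static_cast<std::size_t>(chld->index_)] = nullptr;
        chld->owner_ = nullptr;
        chld->index_ = -1;
        --num_;
    }
    int sizeRoom() const { return static_cast<int>(rooms_.size()); }
    void resizeRoom(int n) { rooms_.resize(static_cast<std::size_t>(n), nullptr); }
    int numRoom() const { return num_; }
private:
    std::vector<DbRoom *> rooms_;
    int num_ = 0;
};

// Usage: attach two rooms, navigate from child to parent, detach one.
inline int demoBuilding() {
    DbBuilding b;
    DbRoom r1, r2;
    b.resizeRoom(4);
    b.attachRoom(0, &r1);
    b.attachRoom(3, &r2);
    assert(r2.ownerBuilding() == &b && r2.indexBuilding() == 3);
    b.detachRoom(&r1);
    return b.numRoom();  // one room remains attached
}
```

The point of the sketch is that the parent and child sides are kept consistent by the attach and detach methods, so client code never has to update the back pointer or index by hand.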

The methods are named according to a systematic convention, described in more detail below. If you wish, you can define your own conventions using SGEN's configuration file.

Background

For the curious, this section provides some background about SGEN and code generators in general. Others can skip to the next section.

Advantages

Disadvantages

Relation to Other Tools

Writing the Input File

This section explains how to describe your data structures to SGEN. If you are only interested in using code that someone else already generated, you can skip to the next section.

Overview

The input file to SGEN describes the desired data structures using several statements:

A few other matters:
Foreign Objects
Adding/Overriding Methods using your own Subclass (Baseonly Option)
Sizing of Vectors
Constructors and Destructors
Copying Objects

Some general syntax notes:

The input syntax is embedded in the Python scripting language. Advanced users may take advantage of Python's constructs for looping, variables, etc. However, no knowledge of Python is required to use SGEN.

Primitive Types

To describe an available primitive type to SGEN, use the statement:

Prim(name, options)

where name is the name of the type.

Examples:

Prim("int")
Prim("float")
Prim("double")
Prim("long")
Prim("char")
Prim("unsigned int", InitVal=0)
Prim("unsigned char", InitVal=0)
Prim("string", HasConstructor=1, IncludeFile="sgenstring.hpp")

Note: The include file sgenstring.hpp takes care of bringing std::string into the global namespace.

Descriptions of Options Applicable to Primitive Types


Angles
If 1, enclose include file for a primitive type in <> rather than "". Use in conjunction with IncludeFile option.
Legal values: 0, 1.
Default: 0

ExpectedSize
The expected size of the object in bytes.
Default: 4

HasConstructor
If 1, this type is assumed to have a constructor. If 0, SGEN won't rely on that.
Legal values: 0, 1.
Default: 0

IncludeFile
String giving name of include file needed for a primitive type, e.g. "string.h".
Default: None

InitVal
If this type lacks both a constructor and a pseudo-constructor (e.g. int(), float(), etc.), specify the initial value to be used for variables of this type.
Default: constructor or pseudo-constructor used to set initial value

PassAsRef
If 1, pass attributes of this type as a const reference instead of as a value. This is useful for complex types whose size exceeds one word.
Legal values: 0, 1.
Default: 0

Modules

To declare a module (set of related objects):

Module(name, options)

where

Examples:

Module("Db")
Module("Db", Foreign=1)

All data structures constituting a given module must be described in one input file and generated in one execution of SGEN. (One module cannot be described in multiple input files, but one input file can describe multiple modules).

Descriptions of Options Applicable to Modules


Foreign
If 1, treat this module as foreign. It will be assumed that all objects, etc. in this module were generated separately. Note that no additional data can be added to foreign objects.
Legal values: 0, 1.
Default: 0

Pack
True to pack data members in a bit field when possible unless otherwise specified.
Legal values: 0, 1.
Default: Unpacked

Enumerated Types

To declare an enumerated type:

Enum(module, name, valueList, options)

where

Examples:

Enum("Db", "State", ["Undef", "Ready", "Done"])
Enum("Db", "State", ["Undef", "Ready", "Done"], Comment="A possible state of node.")
Enum("Db", "State", [("Undef", "Undefined state"), ("Ready", "Ready to process"), "Done"])

Descriptions of Options Applicable to Enumerated Types


Comment
Comment describing this enumerated type.
Default: None

Objects

To declare an object class:

Object(name, module, options)

where

Examples:

Object("Node", "Db")
Object("Node", "Db", Virtual=1, Comment="A node in the graph")

Options applicable to Objects: Baseonly, Comment, ConstructorArgs, ConstructorVisibility, CopyConstructorVisibility, CopyDepth, Foreign, MinDepthToCopy, Pack, PrivateBases, ProtectedBases, PublicBases, Static, Virtual

Descriptions of Options Applicable to Objects


Baseonly
If 1, generate only base class for this object from which the user can derive a subclass.
Legal values: 0, 1.
Default: 0

Comment
Comment describing object.
Default: None

ConstructorArgs
A list of tuples specifying arguments to be accepted by the constructor of the generated class. Example: [('int', 'x'),('int', 'y', '0')]
Default: Constructor takes no arguments.

ConstructorVisibility
Visibility of constructor.
Legal values: public, private, protected.
Default: public

CopyConstructorVisibility
Visibility of copy constructor, if any. Ignored if CopyDepth=None.
Legal values: public, private, protected.
Default: same as ConstructorVisibility

CopyDepth
Type of copy constructor to be generated. See Copying Objects for details.
Legal values: None, shallow, deep, both.
Default: None

Foreign
If 1, treat this object as foreign. It will be assumed that this object was generated separately. No further data can be added to the object.
Legal values: 0, 1.
Default: 0

MinDepthToCopy
When deep-copying, stop at and do not copy or attach objects of this type if MinDepthToCopy exceeds the depth passed to the top-level copy constructor.
Default: 1

Pack
True to pack data members in a bit field when possible unless otherwise specified.
Legal values: 0, 1.
Default: Inherit option from module.

PrivateBases
Object or list of objects that are private base classes of this object.
Default: None

ProtectedBases
Object or list of objects that are protected base classes of this object.
Default: None

PublicBases
Object or list of objects that are public base classes of this object.
Default: None

Static
If 1, make all data owned by object static.
Legal values: 0, 1.
Default: 0

Virtual
If 1, make this object a virtual class. In this case methods called via a base class pointer will be implemented by the appropriate method of the actual subclass.
Legal values: 0, 1.
Default: 0

Attributes

To declare an attribute:

Attr(owner, name, attrType, container, options)

where

Examples:

Attr("DbNode", "Cost", "float", scalar, InitVal="0.0f")
Attr("DbPhonebook", "Number", "int", map, KeyType="string", InitVal=-1)

Options common to all attributes: Comment, DataVisibility, DefaultVal, InitVal, Max, Min, Pack, Static, Visibility, WriteVisibility

Additional options applicable to scalar attributes: None.

Additional options applicable to vector attributes: Sizing, SizingVisibility

Additional options applicable to map attributes: KeyType

Descriptions of Options Applicable to Attributes


Comment
Comment describing this attribute.
Default: None

DataVisibility
Visibility of data members relevant to this attribute.
Legal values: public, private, protected.
Default: private

DefaultVal
Default value to be returned for unknown key. Applicable to map attributes only.
Default: assert that key is known when retrieving values

InitVal
Initial value of attribute.
Default: usual initial value of attribute's type

KeyType
Type of key. Applicable to map attribute. May be a primitive, enum, or an object. If an object, map is keyed on pointers to the object.
Default: Not a map attribute.

Max
Maximum legal value. Applicable to integer-valued attributes only.
Default: infinity

Min
Minimum legal value. Applicable to integer-valued attributes only.
Default: negative infinity

Pack
True to pack value of this attribute in a bit field when possible. Applicable to scalar attributes only.
Legal values: 0, 1.
Default: Inherit option from object.

Sizing
How size of vector for this attribute is controlled.
Legal values: manual, auto.
Default: manual (for vector attributes) or None (for non-vector attributes)

SizingVisibility
Visibility of methods that change size or capacity of vector attribute.
Legal values: public, private, protected.
Default: same as value of WriteVisibility option

Static
If 1, make data for this attribute static.
Legal values: 0, 1.
Default: 0

Visibility
Visibility of all read-only methods relevant to this attribute.
Legal values: public, private, protected.
Default: public

WriteVisibility
Visibility of methods that change this attribute.
Legal values: public, private, protected.
Default: same as value of Visibility option

Relations

There is one statement for each supported type of relation:

Single Relation

To declare a single relation, in which a parent may have zero or one child:

Single(parent, child, options)

where

Examples:

Single("DbCountry", "DbCapital")

Options common to all relations: Comment, Counted, DataVisibility, Destruct, GetOwner, GetOwnerMethod, MaxNum, Mdestruct, MemberCopyFunc, MemberPrefix, MemberSubClassList, MemberSuffix, MemberVisibility, Name, OwnerPrefix, OwnerSuffix, Static, Visibility, WriteVisibility

Additional options applicable to single relations: None.

IList Relation

To declare an intrusive list relation, in which a parent has an ordered list of zero or more children:

IList(parent, child, options)

where

Examples:

IList("DbCountry", "DbState", Linkage=singly)
IList("DbCountry", "DbState", Linkage=doubly, Counted=1)

Options common to all relations: Comment, Counted, DataVisibility, Destruct, GetOwner, GetOwnerMethod, MaxNum, Mdestruct, MemberCopyFunc, MemberPrefix, MemberSubClassList, MemberSuffix, MemberVisibility, Name, OwnerPrefix, OwnerSuffix, Static, Visibility, WriteVisibility

Additional options applicable to ilist relations: Linkage

Vector Relation

To declare a vector relation, in which a parent has an indexed set of children stored in a vector:

Vector(parent, child, options)

where

Examples:

Vector("DbCountry", "DbState")
Vector("DbCountry", "DbState", Sizing=auto, GetIndex=1)

Conventions for sizing of vectors are described in the section Sizing of Vectors.

Options common to all relations: Comment, Counted, DataVisibility, Destruct, GetOwner, GetOwnerMethod, MaxNum, Mdestruct, MemberCopyFunc, MemberPrefix, MemberSubClassList, MemberSuffix, MemberVisibility, Name, OwnerPrefix, OwnerSuffix, Static, Visibility, WriteVisibility

Additional options applicable to vector relations: GetIndex, IndexName, Sizing, SizingVisibility, Sort

Map Relation

To declare a map relation, in which a parent has a set of children indexed by arbitrary keys, use the following statement:

Map(parent, child, keyType, options)

where

Examples:

Map("DbCountry", "DbState", "string")
Map("DbCountry", "DbState", "string", GetIndex=1, GetOwner=1)

Options common to all relations: Comment, Counted, DataVisibility, Destruct, GetOwner, GetOwnerMethod, MaxNum, Mdestruct, MemberCopyFunc, MemberPrefix, MemberSubClassList, MemberSuffix, MemberVisibility, Name, OwnerPrefix, OwnerSuffix, Static, Visibility, WriteVisibility

Additional options applicable to map relations: DefaultKeyIllegal, GetIndex, IndexName

Edge Relation

In an edge relationship, a node (the parent) has a set of zero or more edges (the children). Each edge may have two parent nodes. To declare an edge relation:

Edge(parent, child, options)

where

Examples:

Edge("DbNode", "DbEdge")
Edge("DbNode", "DbEdge", Sizing=auto)

Conventions for sizing of vectors used in edge relationships are described in the section Sizing of Vectors.

Options common to all relations: Comment, Counted, DataVisibility, Destruct, GetOwner, GetOwnerMethod, MaxNum, Mdestruct, MemberCopyFunc, MemberPrefix, MemberSubClassList, MemberSuffix, MemberVisibility, Name, OwnerPrefix, OwnerSuffix, Static, Visibility, WriteVisibility

Additional options applicable to edge relations: Sizing, SizingVisibility

Descriptions of Options Applicable to Relations


Comment
Comment describing this relation.
Default: None

Counted
Maintain count of children in relationship.
Legal values: 0, 1.
Default: 0

DataVisibility
Visibility of data members relevant to this relation.
Legal values: public, private, protected.
Default: private

DefaultKeyIllegal
If true, default key value is illegal and indicates unattached members. Applicable to map relations. Typically used with string keys to indicate that empty string is illegal.
Legal values: 0, 1.
Default: 1 for string keys with GetIndex option, 0 otherwise.

Destruct
What to do with any remaining children when parent is destructed.
Legal values: Default: check

GetIndex
Support getting key or index from child. Applicable to vector and map relationships.
Legal values: 0, 1.
Default: 0, or, if IndexName option is specified, 1.

GetOwner
Recovery of parent from child object.
Legal values: Default: 0

GetOwnerMethod
Name of method to get owner, e.g. "getOwner".
Default: usual name generated according to conventions

IndexName
Turns on GetIndex option, but the name of the method to get key or index is changed so it appears to be getting an attribute with the specified name.
Default: access key or index in usual way

Linkage
Type of linkage in linked lists.
Legal values: singly, doubly.
Default: singly

MaxNum
Maximum allowable number of children attached at one time.
Default: infinity

Mdestruct
What to do when a child is destructed.
Legal values: Default: check if possible, otherwise neither check nor detach

MemberCopyFunc
This option specifies a special function that should be used to make a shallow copy of an object when deep copying its parent. The original object is assumed to be called childOrigX where X is the name of the child class. May be a string such as "myNewFunc(childOrigRoom)", or a dictionary mapping from objects to the appropriate such string for each. This option is applicable only if Destruct=delete and the parent class supports deep copy. See section on copying for further details.
Default: new(childOrigX, 0), where X is the name of the child class

MemberPrefix
Prefix added to member name when constructing relationship and method names.
Default: None

MemberSubClassList
List of possible subclasses of child class whose instances might be attached as children in this relation, and that should be copied as a subclass. Applicable only if Destruct=delete and the parent class supports deep copy. See Deep Copying Polymorphic Children for further details.
Default: copy children as if they were instances of the base class. If the MemberCopyFunc option is a dictionary, MemberSubClassList defaults to the objects that are its keys.

MemberSuffix
Suffix added to member name when constructing relationship and method names.
Default: None

MemberVisibility
Visibility of methods for this relation in the member (child) class.
Legal values: public, private, protected.
Default: same as value of Visibility option

Name
Name used for relationship in comments and documentation.
Default: concatenate names of parent and child objects, including any prefixes and suffixes

OwnerPrefix
Prefix added to owner name when constructing relationship and method names.
Default: None

OwnerSuffix
Suffix added to owner name when constructing relationship and method names.
Default: None

Sizing
How size of vector for this relation is controlled.
Legal values: manual, auto.
Default: manual (for vector relations) or None (for non-vector relations)

SizingVisibility
Visibility of methods that change size or capacity of vector in this relation.
Legal values: public, private, protected.
Default: same as value of WriteVisibility option

Sort
If 1, provide method to sort children. Applicable to vector relationships.
Legal values: 0, 1.
Default: 0

Static
Make data in parent for this relation static.
Legal values: 0, 1.
Default: 0

Visibility
Visibility of all read-only methods relevant to this relation.
Legal values: public, private, protected.
Default: public

WriteVisibility
Visibility of methods that change this relation.
Legal values: public, private, protected.
Default: same as value of Visibility option

Mutually Exclusive Relations

Sometimes an object may be a child of at most one of several mutually-exclusive relations. In this case, any relevant data members in the child (such as a pointer back to the owning parent or an index by which the parent refers to the child) may be safely shared by all such relations. This is the purpose of the Exclusion statement.

Consider the following example. Suppose a parent has separate lists of children, one for males, one for females:


Vector(Parent, Child, MemberPrefix='Male', GetOwner=1, GetIndex=1)
Vector(Parent, Child, MemberPrefix='Female', GetOwner=1, GetIndex=1)

SGEN will generate four methods in the child:

ownerParentMaleChild()
ownerParentFemaleChild()
indexParentMaleChild()
indexParentFemaleChild()

and the four corresponding data members. But since we know no child should appear in both vectors, we add an Exclusion statement:

maleRelation = Vector(Parent, Child, MemberPrefix='Male', GetOwner=1, GetIndex=1)
femaleRelation = Vector(Parent, Child, MemberPrefix='Female', GetOwner=1, GetIndex=1)
Exclusion([maleRelation, femaleRelation])

The argument is a list of the mutually exclusive relations. This causes SGEN to generate only two methods in the child, which are shared by both relations:

ownerParent()
indexParent()

and only the two corresponding data members.
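
The effect of sharing can be pictured with a small hand-written C++ sketch. It is not SGEN output: the parent-side method names (attachMaleChild, findFemaleChild, and so on), the data members, and the helper demoExclusion are assumptions for illustration. The child stores a single back pointer and a single index, whichever of the two vectors it is attached to.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Parent;

// Hand-written stand-in for the child class (NOT actual SGEN output).
class Child {
public:
    Parent *ownerParent() const { return owner_; }
    int indexParent() const { return index_; }
private:
    friend class Parent;
    Parent *owner_ = nullptr;  // one back pointer shared by both relations
    int index_ = -1;           // one index shared by both relations
};

// Hand-written stand-in for the parent class (NOT actual SGEN output).
class Parent {
public:
    void attachMaleChild(int x, Child *c) { attach(male_, x, c); }
    void attachFemaleChild(int x, Child *c) { attach(female_, x, c); }
    Child *findMaleChild(int x) const { return find(male_, x); }
    Child *findFemaleChild(int x) const { return find(female_, x); }
private:
    void attach(std::vector<Child *> &v, int x, Child *c) {
        assert(x >= 0);
        // Mutual exclusion: the child may not already be in either vector.
        assert(c->owner_ == nullptr);
        std::size_t i = static_cast<std::size_t>(x);
        if (i >= v.size()) v.resize(i + 1, nullptr);
        assert(v[i] == nullptr);
        v[i] = c;
        c->owner_ = this;
        c->index_ = x;
    }
    static Child *find(const std::vector<Child *> &v, int x) {
        if (x < 0 || static_cast<std::size_t>(x) >= v.size()) return nullptr;
        return v[static_cast<std::size_t>(x)];
    }
    std::vector<Child *> male_;
    std::vector<Child *> female_;
};

// Usage: each child sits in exactly one of the two vectors.
inline bool demoExclusion() {
    Parent p;
    Child a, b;
    p.attachMaleChild(0, &a);
    p.attachFemaleChild(2, &b);
    return a.ownerParent() == &p && b.ownerParent() == &p &&
           a.indexParent() == 0 && b.indexParent() == 2;
}
```

The saving is one pointer and one integer per child for every additional relation in the exclusion set, at the cost of the child not recording which of the relations it belongs to.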

Note that SGEN attempts to generate sensible names for the shared methods and data members in the child. Additional options (see below) are provided to override the default names.

Some additional points about Exclusion statements:

Descriptions of Options Applicable to Exclusions


MemberPrefix
Prefix added to member name when constructing method names.
Default: None

MemberSuffix
Suffix added to member name when constructing method names.
Default: None

Name
Name to use for this set of mutually exclusive relations, e.g. ParentPrefixChildSuffix.
Default: based on names of parent and child object and member prefix and suffix.

Dynamic Casting to a Subclass

When a subclass is derived from a virtual base class, one is often faced with the problem of how to obtain an instance of the subclass from an instance of the base class. In C++, this can be done with dynamic_cast<Subclass *>(inst). SGEN's Dyncast statement provides an alternative implementation using virtual methods.

Example:

Dyncast(Pet, [Cat, Dog])

will generate the following methods on Pet and its subclasses:

/// If this Pet is a Cat, returns the Cat. Else returns 0.
virtual Cat * Pet::dyncastCat();
/// If this Pet is a Dog, returns the Dog. Else returns 0.
virtual Dog * Pet::dyncastDog();
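
One common way to implement such casts with virtual methods is sketched below. This is an illustration of the general technique, not necessarily the code SGEN emits: the base class returns null from every dyncast method, and each subclass overrides only the method for its own type. The helper countCats is hypothetical.

```cpp
#include <cassert>
#include <cstddef>

class Cat;
class Dog;

class Pet {
public:
    virtual ~Pet() {}
    /// If this Pet is a Cat, returns the Cat. Else returns 0.
    virtual Cat *dyncastCat() { return nullptr; }
    /// If this Pet is a Dog, returns the Dog. Else returns 0.
    virtual Dog *dyncastDog() { return nullptr; }
};

class Cat : public Pet {
public:
    Cat *dyncastCat() override { return this; }
};

class Dog : public Pet {
public:
    Dog *dyncastDog() override { return this; }
};

// Hypothetical client code: count the cats among n pets.
inline int countCats(Pet *const *pets, std::size_t n) {
    int cats = 0;
    for (std::size_t i = 0; i < n; ++i) {
        assert(pets[i] != nullptr);
        if (pets[i]->dyncastCat() != nullptr) ++cats;
    }
    return cats;
}
```

Compared with dynamic_cast, this approach costs one virtual call per query and does not depend on run-time type information being enabled, but every castable subclass must be known when the base class is generated.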

Friends

As in C++, a class A must be a friend of class B for methods of class A to access protected and private members of class B.

To declare that friendObj is a friend of obj:

Friend(obj, friendObj)

Examples:

Friend("DbNode", "DbTree")

User-defined Methods

The UserMethod command allows you to add declarations for user-defined methods to an automatically generated class definition.

UserMethod(obj, retval, code, comment, rel, attr, Visibility)

where

Examples:

UserMethod("DbCountry", "static void", "initialize(int a=4)", "Call to initialize.")

UserMethod("DbCountry", "inline DbState *", "firstState() const { return this->findState(); }",
"Find first State.", rel="DbCountryState")

UserMethod("DbNode", "double", "getCostAsDouble() const { return (double) this->getCost(); }",
"Gets Cost attribute as a double instead of float.", attr="Cost", Visibility=protected)

If you need to add many methods to a class, you may find it easier to use the Baseonly feature.

Specifying a Version of the Input File

You can specify a version of the input file, which will appear in comments in the generated code:

Version(version)

where

Verbatim Code

The author of SGEN confesses that he has not thought of everything. For this reason, users may insert arbitrary code at various places in the generated files using the verbatim feature:

Verbatim(loc, what, code)

where

Examples:

Verbatim("public", "DbNode", "double GetNodeCost() { return this->getWeight() * 2; }")

If you are just adding methods to a class, it is usually better to use the UserMethod feature.

If you need to add a lot of stuff to a class, you may find it easier to use the Baseonly feature.

Foreign Objects

A class that was generated in a different run of SGEN (usually for another module) or written manually is considered foreign. No header file for the class is generated. Because the foreign class definition has already been generated, it cannot be changed. This means a relation with a foreign object as child cannot support the GetOwner or GetIndex options, or use intrusive containers such as IList. Classes cannot be made friends of foreign classes.

To specify that a particular object class is foreign, use the Foreign option for Objects. To specify that everything in a particular module is foreign, use the Foreign option for Modules.

Adding/Overriding Methods using your own Subclass (Baseonly Option)

Sometimes the user would like to supplement or supersede the generated methods with his own manually-written methods.

For example, suppose we have two objects, Company and Employee, with two relations: a map relation EmployeeByName, and a vector relation EmployeeByNumber. Every child should be entered in both relations. Since two separate attach methods are generated, we rely on the user to remember to enter each new child in both. We can avoid this problem by making the generated attach methods protected (with the WriteVisibility option), and deriving a subclass with one public method AttachEmployee that calls both protected methods.

A special option Baseonly is provided to facilitate the derivation of a manually-written subclass from the generated class. Continuing the above example, if this option is set for the Company object in module Db, SGEN will define a class DbCompany_ in a file dbcompany_.hpp (instead of the usual class DbCompany in a file dbcompany.hpp). The user can then derive a subclass DbCompany in a manually written file dbcompany.hpp, where both client code and other generated code will look for it. The generated class DbCompany_ should never be instantiated itself, only via the derived subclass.

Here is an example of what the manually-written dbcompany.hpp should look like.

#include "dbcompany_.hpp"
#ifndef dbcompany_hpp_included
#define dbcompany_hpp_included
class DbCompany : public DbCompany_ {
public:
...
};
#endif

Notice that the include of the automatically generated file precedes the customary #ifndef. This is necessary to avoid problems with cyclic #includes.
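
The Company/Employee scenario can be sketched in a single self-contained file as follows. The class DbCompany_ here is a hand-written stand-in for the generated base class (in practice it would come from dbcompany_.hpp), and the method names findEmployeeByName, attachEmployeeByNumber, etc. are assumed for illustration rather than taken from SGEN's output.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <string>
#include <vector>

class DbEmployee {};

// Stand-in for the generated base class (NOT actual SGEN output).
class DbCompany_ {
public:
    DbEmployee *findEmployeeByName(const std::string &key) const {
        std::map<std::string, DbEmployee *>::const_iterator it = byName_.find(key);
        return it == byName_.end() ? nullptr : it->second;
    }
    DbEmployee *findEmployeeByNumber(int x) const {
        if (x < 0 || static_cast<std::size_t>(x) >= byNumber_.size()) return nullptr;
        return byNumber_[static_cast<std::size_t>(x)];
    }
protected:
    // Write methods are protected, as if WriteVisibility=protected were set.
    void attachEmployeeByName(const std::string &key, DbEmployee *chld) {
        assert(byName_.find(key) == byName_.end());
        byName_[key] = chld;
    }
    void attachEmployeeByNumber(int x, DbEmployee *chld) {
        assert(x >= 0);
        std::size_t i = static_cast<std::size_t>(x);
        if (i >= byNumber_.size()) byNumber_.resize(i + 1, nullptr);
        assert(byNumber_[i] == nullptr);
        byNumber_[i] = chld;
    }
private:
    std::map<std::string, DbEmployee *> byName_;
    std::vector<DbEmployee *> byNumber_;
};

// Manually written subclass: the only public way to attach an employee
// enters it in both relations, so the two can never get out of step.
class DbCompany : public DbCompany_ {
public:
    void attachEmployee(const std::string &name, int number, DbEmployee *chld) {
        attachEmployeeByName(name, chld);
        attachEmployeeByNumber(number, chld);
    }
};
```

Client code sees only DbCompany, so calling one of the protected attach methods directly is a compile-time error rather than a latent inconsistency.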

Sizing of Vectors

Vectors may be used to store attributes and relationships. SGEN's model of vector sizing is generally consistent with that of the STL vector template. A vector's size is the number of locations that are available to use. A vector's capacity is the amount of storage that has been allocated. The capacity may sometimes be larger than the size, but can never be less. The usual STL methods are provided:

size
Returns the current size.
resize
Changes the current size to the specified value. If necessary, the capacity is increased as well. If the size is being reduced, any data above the new size is lost: in the case of attribute vectors, values above the new size are discarded; in the case of relation vectors, children above the new size are detached.
capacity
Returns the current storage capacity.
reserve
If the current capacity is less than the given value, it is increased to the given value. The size remains unchanged. If reserve is called with a value of zero, the vector is resized to 0 and all storage is freed. (This ability to free storage is an addition to the usual STL semantics).

A vector may be declared with either one of these two sizing conventions:

manual
The client must set the size appropriately. Read or write accesses to locations beyond the current size will result in an assertion failure.
auto
The size will be automatically increased as needed when writing to the vector. When reading from a location beyond the current size, a default value is returned. The resize and reserve methods above are still available to be called manually.

Which sizing convention applies is determined by the Sizing option for attributes or the equivalent Sizing option for relations.
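
The difference between the two conventions can be illustrated with a small hand-written vector attribute built on std::vector. This is a sketch of the semantics described above, not SGEN's implementation; the class name FloatVectorAttr, the Sizing enum, and the helper demoAutoSizing are hypothetical.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

enum class Sizing { Manual, Auto };

// Hypothetical vector attribute of floats illustrating both conventions.
class FloatVectorAttr {
public:
    explicit FloatVectorAttr(Sizing sizing, float defaultVal = 0.0f)
        : sizing_(sizing), default_(defaultVal) {}

    int size() const { return static_cast<int>(data_.size()); }
    int capacity() const { return static_cast<int>(data_.capacity()); }

    // New locations are filled with the default value.
    void resize(int n) { data_.resize(static_cast<std::size_t>(n), default_); }

    // Raises capacity if needed; reserve(0) also resizes to 0 and frees storage.
    void reserve(int n) {
        if (n == 0) std::vector<float>().swap(data_);
        else data_.reserve(static_cast<std::size_t>(n));
    }

    void set(int x, float val) {
        assert(x >= 0);
        if (x >= size()) {
            assert(sizing_ == Sizing::Auto);  // manual: out-of-range write is an error
            resize(x + 1);                    // auto: grow to fit
        }
        data_[static_cast<std::size_t>(x)] = val;
    }

    float get(int x) const {
        assert(x >= 0);
        if (x >= size()) {
            assert(sizing_ == Sizing::Auto);  // manual: out-of-range read is an error
            return default_;                  // auto: default value, size unchanged
        }
        return data_[static_cast<std::size_t>(x)];
    }

private:
    Sizing sizing_;
    float default_;
    std::vector<float> data_;
};

// Usage: writing past the end grows an auto-sized vector; reading does not.
inline int demoAutoSizing() {
    FloatVectorAttr v(Sizing::Auto, -1.0f);
    v.set(5, 2.5f);
    float beyond = v.get(100);
    assert(beyond == -1.0f);
    return v.size();
}
```

With Sizing::Manual the same set(5, ...) call on an empty vector would trip the assertion, which is the behavior the manual convention is meant to enforce.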

Constructors and Destructors

SGEN will create a constructor for each class (if one is needed). The constructor's main purpose is to initialize attributes that require it. Normally, the constructor takes no arguments.

SGEN generates a destructor for each class that requires one. If the class is declared as virtual (see section on virtual objects), the destructor function will be made virtual. When an object owning a vector attribute is destructed, the destructor frees any memory allocated to store the vector. When a parent object of a relationship is destructed, the behavior is dictated by the Destruct option. When a child object of a relationship is destructed, the behavior is dictated by the Mdestruct option. If these options are set appropriately, SGEN will generate code to recursively delete all child, grandchild, etc. objects when a parent is destructed, saving the programmer from the tedious task of writing this code.

Copying Objects

Upon request, SGEN will generate a copy constructor for a class. (See the CopyDepth option.)

Depth of Copy

Any of these types of copy constructor may be requested:
None
A bogus private copy constructor is declared to prevent use of the default bit-wise copy.
shallow
Only the object itself and its attributes are copied. The fresh copy will have no children attached.
deep
The object, its attributes, and links to child objects are all copied. The children are handled in one of two ways. We say that a child is "owned" if it is automatically deleted when the parent is destructed, as indicated by the Destruct option being set to delete. A child is "non-owned" if it is automatically detached when the parent is destructed, as indicated by the Destruct option being set to detach. Owned children are recursively deep copied; then the copy of the child is attached to the copy of the parent. Non-owned children are not copied; the copy of the (parent) object points to the same child objects that the original object did.
both
A second argument to the copy constructor can be used to select shallow or deep, as follows:
Obj::Obj(const Obj &orig, bool deep);
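
A hand-written sketch of a copy constructor with selectable depth is shown below. It follows the rules stated above for shallow and deep copies, but it is not SGEN's generated code: the Child class, the member names, and the attachOwned and attachBorrowed methods are assumptions. Here "owned" children stand for a relation with Destruct=delete and "borrowed" children for one with Destruct=detach.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

struct Child {
    int value;
    explicit Child(int v) : value(v) {}
};

class Obj {
public:
    Obj() : cost_(0) {}

    // Copy constructor with selectable depth.
    Obj(const Obj &orig, bool deep) : cost_(orig.cost_) {
        if (!deep) return;  // shallow: attributes only, no children attached
        // Owned children are copied, and the copies are attached to the new parent.
        for (std::size_t i = 0; i < orig.owned_.size(); ++i)
            owned_.push_back(new Child(*orig.owned_[i]));
        // Non-owned children are not copied; the new parent points to the same objects.
        borrowed_ = orig.borrowed_;
    }

    // The ordinary one-argument copy is disabled to prevent accidental use.
    Obj(const Obj &) = delete;
    Obj &operator=(const Obj &) = delete;

    ~Obj() {
        for (std::size_t i = 0; i < owned_.size(); ++i) delete owned_[i];
    }

    int getCost() const { return cost_; }
    void setCost(int val) { cost_ = val; }
    void attachOwned(Child *c) { assert(c != nullptr); owned_.push_back(c); }
    void attachBorrowed(Child *c) { assert(c != nullptr); borrowed_.push_back(c); }
    std::size_t numOwned() const { return owned_.size(); }
    std::size_t numBorrowed() const { return borrowed_.size(); }
    Child *owned(std::size_t i) const { return owned_[i]; }
    Child *borrowed(std::size_t i) const { return borrowed_[i]; }

private:
    int cost_;
    std::vector<Child *> owned_;     // deleted with the parent
    std::vector<Child *> borrowed_;  // merely detached with the parent
};
```

Because owned children are duplicated, destroying the copy never invalidates the original's children; borrowed children, by contrast, must outlive both parents.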

Prerequisites for Deep Copying

For deep copying of an object to be well-defined, two conditions must be met:

If either condition is violated for an object that you have requested, directly or indirectly, to be deep copyable, SGEN will issue an error.

Copying from a Base Class to a Subclass

Suppose you need the following copy constructor, where Zebra is (as any toddler knows) a subclass of Animal:

Zebra::Zebra(const Animal &orig);

Instruct the generator to generate a copy constructor for Animal (by setting the CopyDepth option on Animal). This will generate something like:

Animal::Animal(const Animal &orig);

Then you can manually add a special copy constructor for Zebra as follows:

Zebra::Zebra(const Animal &orig) : Animal(orig)
{
  // Provide code to initialize Zebra-specific data here
}
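
For a complete picture, here is a small self-contained sketch of the same pattern. The Legs and Stripes attributes and the helper demoZebraFromAnimal are invented for illustration, and the Animal copy constructor is written by hand here where SGEN would generate it.

```cpp
#include <cassert>

class Animal {
public:
    Animal() : legs_(4) {}
    // Stands in for the copy constructor SGEN would generate for Animal.
    Animal(const Animal &orig) : legs_(orig.legs_) {}
    virtual ~Animal() {}
    int getLegs() const { return legs_; }
    void setLegs(int val) { legs_ = val; }
private:
    int legs_;  // hypothetical attribute
};

class Zebra : public Animal {
public:
    Zebra() : stripes_(0) {}
    // Manually added: builds a Zebra from any Animal by delegating the
    // Animal part to the base-class copy constructor.
    explicit Zebra(const Animal &orig) : Animal(orig), stripes_(0) {}
    int getStripes() const { return stripes_; }
    void setStripes(int val) { stripes_ = val; }
private:
    int stripes_;  // hypothetical Zebra-specific attribute
};

// Usage: the base-class data carries over, Zebra data starts fresh.
inline bool demoZebraFromAnimal() {
    Animal a;
    a.setLegs(3);
    Zebra z(a);
    assert(z.getLegs() == 3);
    return z.getStripes() == 0;
}
```

Marking the constructor explicit is a choice made in this sketch so that an Animal is never silently converted to a Zebra; the manual's declaration does not require it.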

Deep Copying Polymorphic Children (Children of Various Subclasses)

Sometimes a parent may have an association through which children of various subclasses are owned. For instance, a parent object Zoo may have a list of owned Animals, each of which may actually be an instance of subclass Bear or Elephant. By default, the deep copy constructor for Zoo will copy all the children as Animals, rather than Bears or Elephants. If you wish that the copies be instances of Bear or Elephant as appropriate, you must tell SGEN the set of relevant subclasses using the MemberSubClassList option. For instance:

IList(Zoo, Animal, MemberSubClassList=[Bear, Elephant])

The classes in the MemberSubClassList must appear in an order such that no class is preceded by any of its base classes. The common base class (e.g. Animal in our example) may appear in the MemberSubClassList (but if so, it must be last). No class may appear more than once. (An error will be issued if any of these rules is violated.)

Use of this option requires that the member base class (Animal) be virtual so that dynamic casting can be used to determine the correct subclass of the member instance.
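
The ordering rule can be understood from the following hand-written sketch of a polymorphic deep copy. It assumes the subclass test is done with the standard dynamic_cast operator, which may differ from what SGEN actually generates, and the helper copyAnimal and the Zoo methods are hypothetical. Subclasses are tested before the base class because every Bear is also an Animal; testing Animal first would copy everything as a plain Animal.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class Animal {
public:
    Animal() {}
    Animal(const Animal &) {}
    virtual ~Animal() {}  // virtual, so dynamic casting works
};
class Bear : public Animal {};
class Elephant : public Animal {};

// Copies a child as its actual subclass. The tests run in the order
// [Bear, Elephant, Animal]: no class is preceded by one of its bases.
inline Animal *copyAnimal(const Animal *orig) {
    assert(orig != nullptr);
    if (const Bear *b = dynamic_cast<const Bear *>(orig)) return new Bear(*b);
    if (const Elephant *e = dynamic_cast<const Elephant *>(orig)) return new Elephant(*e);
    return new Animal(*orig);
}

class Zoo {
public:
    Zoo() {}
    // Deep copy: each owned child is copied as its own subclass.
    Zoo(const Zoo &orig) {
        for (std::size_t i = 0; i < orig.animals_.size(); ++i)
            animals_.push_back(copyAnimal(orig.animals_[i]));
    }
    Zoo &operator=(const Zoo &) = delete;
    ~Zoo() {
        for (std::size_t i = 0; i < animals_.size(); ++i) delete animals_[i];
    }
    void attachAnimal(Animal *a) { animals_.push_back(a); }
    std::size_t numAnimal() const { return animals_.size(); }
    Animal *findAnimal(std::size_t x) const { return animals_[x]; }
private:
    std::vector<Animal *> animals_;  // owned children (as with Destruct=delete)
};
```

If Bear had its own subclass, say a hypothetical PolarBear, it would have to be tested before Bear for the same reason.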

Running Sgen

Once you have prepared an input file, you can run Sgen as follows:
sgen [--conv convFile] [--man] [--doc] [infile]

The arguments do the following:

--conv
Reads an optional conventions file. See Customization for details.
--man
Generates a version of this HTML user manual modified in accordance with the specified conventions.
--doc
Generates HTML documentation for the generated code. Requires that you have the program doxygen in your path.
infile
The input file prepared as specified in the previous section.

SGEN produces the following files:

Source (cpp) files that call the generated code must include the types file for the relevant module(s), plus the header files of only those classes whose methods they invoke. (Compile time is reduced because you need not include header files for classes you merely point to.)

Using the Generated Code

This section describes how to use code generated by SGEN in your programs.

Naming Conventions

The default naming conventions used by SGEN are as follows. (You can alter the conventions using the SGEN configuration file described in Customization below.)

A typical name for a class of objects is:

DbCountry
where
Db
is the name of the module (group of related objects).
Country
is the kind of object the class represents.
A typical method you might find on the DbCountry class is:
firstCity()

where

DbCountry
is the class owning the method
first
is the name of the operation (get the first child of a list). Each operation is identified by a short key word. Other common operators are "next" to get the next child, "owner" to get the parent from a child, "get" to get an attribute value of an object and "set" to set an attribute.
City
is the kind of child object.
There may also be methods on the child class, DbCity. For instance, to get back to the parent (owner) Country we would use:
ownerCountry()
In all cases, the portion of the name before the underscore indicates the class containing the method.

In some special situations, a distinctive prefix or suffix must be added to the child class name. Here are some examples.

Scalar Attributes

A scalar attribute is a single scalar value.

Methods Generated for Scalar Attributes

set
Example: prnt->setCost(val)
Set value to val.
get
Example: prnt->getCost()
Returns value.

Vector Attributes

A vector attribute has values indexed by an integer and stored in a vector container. Conventions for sizing of vectors used in attributes are described below.

Methods Generated for Vector Attributes

set
Example: prnt->setCost(x, val)
Set value for given index x to val.
get
Example: prnt->getCost(x)
Returns value for given index x.
size
Example: prnt->sizeCost()
Returns current size of vector.
resize
Example: prnt->resizeCost(n)
Resizes vector. If n>capacity, capacity is increased to n as well. Values in locations x>=n are lost.
capacity
Example: prnt->capacityCost()
Returns current storage capacity of vector.
reserve
Example: prnt->reserveCost(n)
If n > capacity, raises capacity to n. If n == 0, resizes to 0 and frees storage.
begin
Example: prnt->beginCost()
A const_iterator to first value.
end
Example: prnt->endCost()
A const_iterator after last value.
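The behaviour of these methods can be pictured with a hand-written stand-in built on std::vector. This is only a sketch: the class name DbNode, the double value type, and the requirement that set and get receive an index below the current size are assumptions, not a description of the generated code.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hand-written stand-in for a class with a vector attribute named Cost.
class DbNode {
public:
    void setCost(std::size_t x, double val) {
        assert(x < cost_.size());  // assumption: index must be within current size
        cost_[x] = val;
    }
    double getCost(std::size_t x) const {
        assert(x < cost_.size());
        return cost_[x];
    }
    std::size_t sizeCost() const { return cost_.size(); }
    void resizeCost(std::size_t n) { cost_.resize(n); }  // values at x >= n are lost
    std::size_t capacityCost() const { return cost_.capacity(); }
    void reserveCost(std::size_t n) {
        if (n == 0)
            std::vector<double>().swap(cost_);  // resize to 0 and release storage
        else
            cost_.reserve(n);                   // no effect unless n > capacity
    }
    std::vector<double>::const_iterator beginCost() const { return cost_.begin(); }
    std::vector<double>::const_iterator endCost() const { return cost_.end(); }
private:
    std::vector<double> cost_;
};
```

The swap with an empty temporary is one common way to make a vector give back its storage, which plain resize(0) does not do.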

Map Attributes

A map attribute has values indexed by an arbitrary key and stored in a map container.

Methods Generated for Map Attributes

set
Example: prnt->setCost(x, val)
Set value for given key x to val.
get
Example: prnt->getCost(x)
Returns value for given key x. Assumes a value has been entered under x (unless default value specified with DefaultVal option).
find
Example: prnt->findCost(x)
Returns a pointer to value under given key x, or null if none.
erase
Example: prnt->eraseCost(x)
Erases entry with given key x and returns 1, or returns 0 if no such entry.
begin
Example: prnt->beginCost()
A const_iterator to first [key, value] pair.
end
Example: prnt->endCost()
A const_iterator after last [key, value] pair.

Macros Generated for Map Attributes

Loop with index
Example: FOR_DbNodeCost_Index(o,v,x)
Loop over all values v of object o, also finding each value's corresponding index x
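The map attribute methods listed above can be sketched with a hand-written stand-in built on std::map. The class name DbNode and the string key and double value types are assumptions chosen for illustration; the generated code may differ in detail.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hand-written stand-in for a class with a map attribute named Cost.
class DbNode {
public:
    typedef std::map<std::string, double> CostMap;

    void setCost(const std::string &x, double val) { cost_[x] = val; }
    double getCost(const std::string &x) const {
        CostMap::const_iterator it = cost_.find(x);
        assert(it != cost_.end());  // get assumes a value was entered under x
        return it->second;
    }
    // Returns a pointer to the stored value, or null if the key is absent.
    const double *findCost(const std::string &x) const {
        CostMap::const_iterator it = cost_.find(x);
        return it == cost_.end() ? nullptr : &it->second;
    }
    // Returns 1 if an entry was erased, 0 if there was none.
    int eraseCost(const std::string &x) { return static_cast<int>(cost_.erase(x)); }
    CostMap::const_iterator beginCost() const { return cost_.begin(); }
    CostMap::const_iterator endCost() const { return cost_.end(); }
private:
    CostMap cost_;
};
```

Use find rather than get when the key may be absent, since get treats a missing key as a programming error.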

Single Relation

A single relation is one in which the parent has at most one child.

Methods Generated for Single Relation

attach
Example: prnt->attachRoad(chld)
Attach chld. Assumes no child currently attached.
find
Example: prnt->findRoad()
Returns child object, or null if none.
detach
Example: prnt->detachRoad()
Detach and return currently attached child object. It is assumed that there is one.
num
Example: prnt->numRoad()
Number of child objects attached.
Supported only with Counted option.
owner
Example: chld->ownerTown()
Parent object to which this child object is currently attached, or null if none.
Supported only with GetOwner option.

Macros Generated for Single Relation

Loop over children
Example: FOR_DbTownRoad(p, c)
Loop over all children c of parent p
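A hand-written stand-in makes the stated assumptions of a single relation concrete. The sketch below assumes the Counted and GetOwner options are both on, and uses assert to express the preconditions given above; it is illustrative and not the generated code.

```cpp
#include <cassert>

class DbTown;

class DbRoad {
public:
    DbTown *ownerTown() const { return owner_; }  // null if not attached
private:
    friend class DbTown;
    DbTown *owner_ = nullptr;
};

class DbTown {
public:
    void attachRoad(DbRoad *chld) {
        assert(road_ == nullptr);  // assumes no child currently attached
        assert(chld != nullptr && chld->owner_ == nullptr);
        road_ = chld;
        chld->owner_ = this;
    }
    DbRoad *findRoad() const { return road_; }  // null if none
    DbRoad *detachRoad() {
        assert(road_ != nullptr);  // assumes there is a child to detach
        DbRoad *chld = road_;
        road_ = nullptr;
        chld->owner_ = nullptr;
        return chld;
    }
    int numRoad() const { return road_ ? 1 : 0; }
private:
    DbRoad *road_ = nullptr;
};
```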

Intrusive List Relation

In this type of relation, a parent has an ordered list of zero or more children. The term intrusive indicates that links are stored directly in the child objects, rather than in separate link objects (as in STL). Intrusive lists may be singly or doubly linked.

Methods Generated for Intrusive List Relation

attach
Example: prnt->attachRoad(chld, prevChld)
Attach chld after prevChld, or, if prevChld is null, as first child.
first
Example: prnt->firstRoad()
Returns first child object, or null if none.
prev
Example: prnt->prevRoad(chld)
Returns child object previous to given one, or null if none.
Supported only if list is doubly linked (Linkage=D).
next
Example: prnt->nextRoad(chld)
Returns next child object after given one, or null if none.
detach
Example: prnt->detachRoad(chld, prevChld)
Detach chld. If prevChld is non-null, it is assumed to be the child previous to chld.
dfirst
Example: prnt->dfirstRoad()
Detach and return first child object. It is assumed that there is one.
num
Example: prnt->numRoad()
Number of child objects attached.
Supported only with Counted option.
owner
Example: chld->ownerTown()
Parent object to which this child object is currently attached, or null if none.
Supported only with GetOwner option.

Macros Generated for Intrusive List Relation

Loop over children
Example: FOR_DbTownRoad(p, c)
Loop over all children c of parent p
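To show what "intrusive" means in practice, here is a hand-written sketch of a singly-linked list relation with the Counted and GetOwner options assumed on. The member names and the FOR_DbTownRoad definition are guesses at one plausible shape, not the code SGEN generates.

```cpp
#include <cassert>

class DbTown;

class DbRoad {
public:
    DbTown *ownerTown() const { return owner_; }
private:
    friend class DbTown;
    DbRoad *next_ = nullptr;   // the link lives in the child itself
    DbTown *owner_ = nullptr;
};

class DbTown {
public:
    // Attach chld after prevChld, or as first child if prevChld is null.
    void attachRoad(DbRoad *chld, DbRoad *prevChld) {
        assert(chld != nullptr && chld->owner_ == nullptr);
        if (prevChld) {
            assert(prevChld->owner_ == this);
            chld->next_ = prevChld->next_;
            prevChld->next_ = chld;
        } else {
            chld->next_ = first_;
            first_ = chld;
        }
        chld->owner_ = this;
        ++num_;
    }
    DbRoad *firstRoad() const { return first_; }
    DbRoad *nextRoad(const DbRoad *chld) const { return chld->next_; }
    // Detach chld; a non-null prevChld is trusted to be its predecessor.
    void detachRoad(DbRoad *chld, DbRoad *prevChld) {
        assert(chld != nullptr && chld->owner_ == this);
        if (!prevChld && first_ != chld) {  // singly linked: search for predecessor
            prevChld = first_;
            while (prevChld->next_ != chld) prevChld = prevChld->next_;
        }
        if (prevChld) prevChld->next_ = chld->next_;
        else first_ = chld->next_;
        chld->next_ = nullptr;
        chld->owner_ = nullptr;
        --num_;
    }
    DbRoad *dfirstRoad() {
        assert(first_ != nullptr);  // assumes there is a first child
        DbRoad *chld = first_;
        detachRoad(chld, nullptr);
        return chld;
    }
    int numRoad() const { return num_; }
private:
    DbRoad *first_ = nullptr;
    int num_ = 0;
};

// Hypothetical expansion of the loop macro.
#define FOR_DbTownRoad(p, c) \
    for (DbRoad *c = (p)->firstRoad(); c != nullptr; c = (p)->nextRoad(c))
```

Passing prevChld to detach lets a singly-linked list unlink a child in constant time; without it, the sketch has to walk the list to find the predecessor.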

Vector Relation

In a vector relation, the parent has any number of child objects, each identified by a non-negative integer. Conventions for sizing of vectors used in relations are described below.

Methods Generated for Vector Relation

attach
Example: prnt->attachRoad(x, chld)
Attach given child object, chld, under index x.
find
Example: prnt->findRoad(x)
Returns child object with index x, or null if none.
detach
Example: prnt->detachRoad(chld)
Detach chld.
Supported only with GetIndex option.
dx
Example: prnt->dxRoad(x)
Detach and return child object attached under index x. It is assumed that there is one.
num
Example: prnt->numRoad()
Number of child objects attached.
Supported only with Counted option.
size
Example: prnt->sizeRoad()
Returns current size of vector.
resize
Example: prnt->resizeRoad(n)
Resizes vector. If n>capacity, capacity is increased to n as well. Assumes no children at locations x>=n.
capacity
Example: prnt->capacityRoad()
Returns current storage capacity of vector.
reserve
Example: prnt->reserveRoad(n)
If n > capacity, raises capacity to n. If n == 0, resizes to 0 and frees storage.
begin
Example: prnt->beginRoad()
A const_iterator to first child object.
end
Example: prnt->endRoad()
A const_iterator after last child object.
owner
Example: chld->ownerTown()
Parent object to which this child object is currently attached, or null if none.
Supported only with GetOwner option.
index
Example: chld->indexTown()
Returns index of this child object in its parent object.
Supported only with GetIndex option.
sort
Example: prnt->sortRoad(compfunc, xBegin, xEnd)
Stable sort children in range xBegin<=x<xEnd.
Supported only with Sort option.

Macros Generated for Vector Relation

Loop over children
Example: FOR_DbTownRoad(p, c)
Loop over all children c of parent p
Loop over children with index
Example: FOR_DbTownRoad_Index(p, c, x)
Loop over all children c of parent p, also finding each child's corresponding index x
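The core vector relation methods can be sketched with a hand-written stand-in, assuming the Counted, GetOwner and GetIndex options are on. One assumption to note: this sketch requires the vector to be resized before attaching at an index, which may not match the sizing conventions SGEN actually uses.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class DbTown;

class DbRoad {
public:
    DbTown *ownerTown() const { return owner_; }
    std::size_t indexTown() const { assert(owner_ != nullptr); return index_; }
private:
    friend class DbTown;
    DbTown *owner_ = nullptr;
    std::size_t index_ = 0;  // kept so a child can find its own slot
};

class DbTown {
public:
    void resizeRoad(std::size_t n) {
        for (std::size_t x = n; x < roads_.size(); ++x)
            assert(roads_[x] == nullptr);  // assumes no children at x >= n
        roads_.resize(n, nullptr);
    }
    std::size_t sizeRoad() const { return roads_.size(); }
    void attachRoad(std::size_t x, DbRoad *chld) {
        assert(x < roads_.size() && roads_[x] == nullptr);
        assert(chld != nullptr && chld->owner_ == nullptr);
        roads_[x] = chld;
        chld->owner_ = this;
        chld->index_ = x;
        ++num_;
    }
    DbRoad *findRoad(std::size_t x) const {
        return x < roads_.size() ? roads_[x] : nullptr;
    }
    // Detach and return the child under index x; assumes there is one.
    DbRoad *dxRoad(std::size_t x) {
        assert(x < roads_.size() && roads_[x] != nullptr);
        DbRoad *chld = roads_[x];
        roads_[x] = nullptr;
        chld->owner_ = nullptr;
        --num_;
        return chld;
    }
    // Needs the stored index, which is why detach depends on GetIndex.
    void detachRoad(DbRoad *chld) {
        assert(chld != nullptr && chld->owner_ == this);
        dxRoad(chld->index_);
    }
    int numRoad() const { return num_; }  // attached children, not vector size
private:
    std::vector<DbRoad *> roads_;
    int num_ = 0;
};
```

Note the difference between num and size: size counts slots in the vector, while num counts only the slots that hold a child.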

Map Relation

In a map relation, the parent can have any number of children, each identified by a unique key. No two children of the same parent may have the same key. The keys may be any primitive type (e.g. integers or strings), or object handles.

Methods Generated for Map Relation

attach
Example: prnt->attachRoad(x, chld)
Attach given child object, chld, under key x.
find
Example: prnt->findRoad(x)
Returns child object with key x, or null if none.
detach
Example: prnt->detachRoad(chld)
Detach chld.
Supported only with GetIndex option.
dx
Example: prnt->dxRoad(x)
Detach and return child object attached under key x. It is assumed that there is one.
num
Example: prnt->numRoad()
Number of child objects attached.
Supported only with Counted option.
size
Example: prnt->sizeRoad()
Returns current size of vector.
begin
Example: prnt->beginRoad()
A const_iterator to first [key, child object] pair.
end
Example: prnt->endRoad()
A const_iterator after last [key, child object] pair.
owner
Example: chld->ownerTown()
Parent object to which this child object is currently attached, or null if none.
Supported only with GetOwner option.
index
Example: chld->indexTown()
Returns index of this child object in its parent object.
Supported only with GetIndex option.

Macros Generated for Map Relation

Loop over children
Example: FOR_DbTownRoad(p, c)
Loop over all children c of parent p
Loop over children with index
Example: FOR_DbTownRoad_Index(p, c, x)
Loop over all children c of parent p, also finding each child's corresponding index x

Edge Relation

In an edge relation, a node (the parent) has any number of edges (children). Each edge may have two parent nodes. Edges always have pointers back to their parent nodes. This relation is currently implemented as a vector. Conventions for sizing of the vectors are described below.

Methods Generated for Edge Relation

attach
Example: prnt->attachRoad(x, chld)
Attach given child object, chld, under index x.
find
Example: prnt->findRoad(x)
Returns child object with index x, or null if none.
detach
Example: prnt->detachRoad(chld)
Detach chld.
Supported only with GetIndex option.
dx
Example: prnt->dxRoad(x)
Detach and return child object attached under index x. It is assumed that there is one.
num
Example: prnt->numRoad()
Number of child objects attached.
Supported only with Counted option.
size
Example: prnt->sizeRoad()
Returns current size of vector.
resize
Example: prnt->resizeRoad(n)
Resizes vector. If n>capacity, capacity is increased to n as well. Assumes no children at locations x>=n.
capacity
Example: prnt->capacityRoad()
Returns current storage capacity of vector.
reserve
Example: prnt->reserveRoad(n)
If n > capacity, raises capacity to n. If n == 0, resizes to 0 and frees storage.
begin
Example: prnt->beginRoad()
A const_iterator to first child object.
end
Example: prnt->endRoad()
A const_iterator after last child object.
owner1
Example: chld->owner1Town()
The object to which end 1 of this object is currently attached, or null if none.
owner2
Example: chld->owner2Town()
The object to which end 2 of this object is currently attached, or null if none.
other
Example: chld->otherTown(node)
The object at other end of this object, or null if none.
sort
Example: prnt->sortRoad(compfunc, xBegin, xEnd)
Stable sort children in range xBegin<=x<xEnd.
Supported only with Sort option.

Macros Generated for Edge Relation

Loop over children
Example: FOR_DbTownRoad(p, c)
Loop over all children c of parent p
Loop over child edges with other parent node
Example: FOR_DbTownRoad_Other(n, e, nOther)
Loop over all edges e on node n, also finding each edge's other node nOther
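The two-ended nature of an edge can be sketched as follows. This stand-in is illustrative only: in particular, how an attach decides whether a node becomes end 1 or end 2 is not stated above, so the rule used here (fill end 1 first, then end 2) is purely an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

class DbTown;

// The edge (child): keeps pointers back to both parent nodes.
class DbRoad {
public:
    DbTown *owner1Town() const { return owner1_; }
    DbTown *owner2Town() const { return owner2_; }
    // Given the node at one end, returns the node at the other end (or null).
    DbTown *otherTown(const DbTown *node) const {
        assert(node == owner1_ || node == owner2_);
        return node == owner1_ ? owner2_ : owner1_;
    }
private:
    friend class DbTown;
    DbTown *owner1_ = nullptr;
    DbTown *owner2_ = nullptr;
};

// The node (parent): holds its edges in a vector.
class DbTown {
public:
    void resizeRoad(std::size_t n) { roads_.resize(n, nullptr); }
    std::size_t sizeRoad() const { return roads_.size(); }
    void attachRoad(std::size_t x, DbRoad *chld) {
        assert(x < roads_.size() && roads_[x] == nullptr);
        assert(chld != nullptr);
        assert(chld->owner1_ == nullptr || chld->owner2_ == nullptr);
        roads_[x] = chld;
        // Assumption: the first free end of the edge is used.
        if (chld->owner1_ == nullptr) chld->owner1_ = this;
        else chld->owner2_ = this;
    }
    DbRoad *findRoad(std::size_t x) const {
        return x < roads_.size() ? roads_[x] : nullptr;
    }
private:
    std::vector<DbRoad *> roads_;
};
```

The other method is what makes graph traversal convenient: starting from a node, each edge leads directly to the neighbouring node.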

Glossary

association
A type of relationship between two (or sometimes more) classes describing a set of links between corresponding objects. Associations may be one-to-one, one-to-many, or many-to-many.
attribute
A named property of an object whose value is some primitive type such as an integer or string, or a vector or map of primitive types.
child
The object to which a link can be traversed, or the object's class. (The many in a one-to-many relationship are generally the children.)
class
A description of a set of attributes, relationships and operations common to a set of objects.
container
A class containing a set of data. Examples are vectors and maps.
intrusive
An intrusive container requires that data be stored in the children, e.g. pointers to next child.
foreign
A class that was generated in a different run of SGEN (usually for another module) or written manually is considered foreign. No header file for the class is generated.
link
A way to find an associated object from a given object. (Generally a pointer from the given object to an associated object).
many-to-many
An association where a parent may have multiple children and a child may have multiple parents. (Many-to-many associations must usually be implemented by introducing an intermediate connector class, with both parent and child having a one-to-many association with connectors.)
method
A member function of a class which performs some operation on objects of the class or other related objects.
object
Instance of a class.
one-to-many
An association where a parent may have multiple children, but each child has one parent.
one-to-one
An association where each parent has one child and each child has one parent.
operation
A specification of a method.
owned
A child is owned by a parent if it should be deleted when the parent is destructed. A child may have multiple parents, but can be owned by at most one of them at a time.
parent
The object from which a link can be traversed, or the object's class. (The one in a one-to-many relationship is generally the parent.)
primitive type
A type such as int, float, string, char, void*, etc.
relationship
A joint characteristic of two or more classes. Association and derivation are each types of relationships.
visibility
Scope within which an operation can be used: public, protected or private in the C++ sense.
exclusion
A constraint among a set of relations sharing the same child class such that a child object can participate in at most one of the relations at a time.

(Some of these terms are inspired by UML terminology, which is conveniently described here.)

Customization

Configuration File

The names used for files, classes, methods etc. may be customized via a configuration file written in Python.

FIX add details

Assertions

The handling of failed assertions can be controlled by modifying sgenassert.hpp to your liking.

Implementation Details

Readers interested only in using SGEN may skip this section.

Containers

SGEN uses STL's vector and map containers. The STL containers are wrapped for use by SGEN as follows:

SGEN name                       uses SGEN template  defined in         implemented by STL template  notes
vector attribute                SgenVector<T>       sgenvector.hpp     vector<T>                    1
map attribute                   SgenMap<K,D>        sgenmap.hpp        map<K,D>                     2
ilist relation (singly-linked)  SgenSListI<T*>      sgenslisti.hpp     -                            -
ilist relation (doubly-linked)  SgenListI<T*>       sgenlisti.hpp      -                            -
vector relation                 SgenPtrVector<T*>   sgenptrvector.hpp  vector<T*>                   1
map relation                    SgenPtrMap<K,D*>    sgenptrmap.hpp     map<K,void*>                 2,3

Notes describing functionality added to STL template in Sgen wrapper file:

  1. Provide macro to reduce capacity of vector to 0.
  2. Provide macros to retrieve key and data from pair.
  3. Implement using map<K,void*> to reduce code bloat, including provision of suitable iterator classes.
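The technique in note 3 can be sketched as a thin typed wrapper around a map of void pointers. PtrMap and Road below are hypothetical names for illustration; this is not the contents of sgenptrmap.hpp, and the iterator classes mentioned in the note are omitted.

```cpp
#include <cstddef>
#include <map>

// Hypothetical child type used only for the example.
struct Road { int id; };

// Thin typed facade over std::map<K, void*>. The heavy map code is
// instantiated once per key type K and shared by every child type D;
// only these small inline forwarding functions are generated per D.
template <class K, class D>
class PtrMap {
public:
    void insert(const K &key, D *child) { impl_[key] = child; }
    D *find(const K &key) const {
        typename std::map<K, void *>::const_iterator it = impl_.find(key);
        // Safe because only D* values are ever stored through insert().
        return it == impl_.end() ? nullptr : static_cast<D *>(it->second);
    }
    std::size_t erase(const K &key) { return impl_.erase(key); }
    std::size_t size() const { return impl_.size(); }
private:
    std::map<K, void *> impl_;
};
```

The cast back from void* is confined to the wrapper, so callers keep full type safety while the number of distinct map instantiations stays small.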

How Copying Works

A copy constructor to do shallow and/or deep copying is provided on each object the user wishes to copy. In addition, if an object can be deep copied, its base classes and the classes of its "owned" children must provide supporting copy constructors and other methods as well.

The copy constructor may be of either of these forms:

Object(const Object &orig)
Object(const Object &orig, bool deep)
and does the following:
  1. Calls the shallow copy constructor for each relevant base class.
  2. Copies remaining attributes of orig to this (the copy).
  3. Initializes all associations to be empty (no children).
  4. If doing a deep copy:
    1. Creates an instance of SgenCopier to temporarily store the mapping from original to copied objects.
    2. Enters the object to be copied and its copy into the copier.
    3. Calls the copied object's copy1 method to complete the deep copy.

The copy1 method (of the copied object):

void copy1(const Object &orig, SgenCopier &copier)
does the following:
  1. Calls the copy1 method of each base class.
  2. For each owned child:
    1. Creates a new copy using the shallow copy constructor, which may or may not be public.
    2. Enters the child and its copy in the copier.
    3. Attaches the copy of the child to the copy of the owner.
  3. If this is the top level invocation of copy1 (called directly from the deep copy constructor for the root object of the copy), calls the original's copy2 method to recursively attach non-owned children (the copy of the child if one was made, or the original if not) to the copy.

The copy2 method (of the original object):

void copy2(SgenCopier &copier) const
does the following:
  1. Gets the copy of the original owner ("this") from the copier.
  2. For each owned child, calls the (original) child's copy2 method to recursively attach non-owned children.
  3. For each non-owned child:
    1. Looks for a copy of the original child in the copier.
    2. If the child was copied, attaches the copy of the child to the copy of the parent.
    3. If the child had no copy, attaches the original child to the copy of the parent.
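The two-pass scheme described above can be sketched on a single simplified class. Everything here is an assumption made for illustration: Node, its owned_ and refs_ vectors, the extra top flag, the std::map standing in for SgenCopier, and the recursive copy1 call on each copied child. Base classes are left out.

```cpp
#include <cstddef>
#include <map>
#include <vector>

class Node;
typedef std::map<const Node *, Node *> Copier;  // stands in for SgenCopier

class Node {
public:
    explicit Node(int id) : id_(id) {}
    // Copies attributes, leaves associations empty, then optionally deep copies.
    Node(const Node &orig, bool deep) : id_(orig.id_) {
        if (deep) {
            Copier copier;
            copier[&orig] = this;
            copy1(orig, copier, true);
        }
    }
    Node(const Node &) = delete;
    Node &operator=(const Node &) = delete;
    ~Node() { for (Node *c : owned_) delete c; }

    void attachOwned(Node *c) { owned_.push_back(c); }
    void attachRef(Node *c) { refs_.push_back(c); }
    const std::vector<Node *> &owned() const { return owned_; }
    const std::vector<Node *> &refs() const { return refs_; }
    int id() const { return id_; }

private:
    // Pass 1: copy the ownership tree and record original -> copy.
    void copy1(const Node &orig, Copier &copier, bool top) {
        for (Node *c : orig.owned_) {
            Node *cc = new Node(*c, false);  // shallow copy of the child
            copier[c] = cc;
            attachOwned(cc);
            cc->copy1(*c, copier, false);    // assumption: recurse into the child
        }
        if (top) orig.copy2(copier);
    }
    // Pass 2: called on the original; wires up non-owned links on the copy.
    void copy2(Copier &copier) const {
        Node *self = copier[this];
        for (const Node *c : owned_) c->copy2(copier);
        for (Node *r : refs_) {
            Copier::iterator it = copier.find(r);
            self->attachRef(it != copier.end() ? it->second : r);
        }
    }
    int id_;
    std::vector<Node *> owned_;  // deleted with this node
    std::vector<Node *> refs_;   // non-owned links
};
```

Two passes are needed because a non-owned link may point to an object whose copy does not exist yet when the link is first encountered; deferring those links until the whole ownership tree has been copied avoids that ordering problem.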

Note that copy1 and copy2 cannot be virtual functions. This ensures that when we copy from or to a subclass of the copy constructor's class, we do not attempt to copy data that we shouldn't.

The copier keeps a table mapping original instances to the corresponding copies. We may encounter pointers to the original object via its actual class. However, parents may also have pointed to it via its various base classes. So we need a way to recover the same unique key for the original object regardless of the class of the pointer we have.

We can use a void* as the key, but the way we compute it depends on what assumptions we wish to make. Let p be a pointer to an original instance. The pointer p may be of the instance's actual class or any of its base classes.

  1. Suppose there is no multiple inheritance. Then the key can safely be: reinterpret_cast<void*>(p)
  2. Suppose there is some class b that is a base class of all the various classes to which we might encounter pointers. Then the key can be: reinterpret_cast<void*>(static_cast<b*>(p))
  3. Suppose all the instances are virtual, enabling us to use dynamic_cast to get a pointer to the unique "most-derived" class of any object. Then the key can be: dynamic_cast<void*>(p)
For now we use 1. In the future, we may switch to 2.
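The following sketch shows why option 1 depends on the absence of multiple inheritance. The classes A, B and C and the helper names are hypothetical; the point is only that pointers to different base subobjects of one object can hold different addresses, while dynamic_cast to void* always yields the address of the most-derived object.

```cpp
// Two unrelated polymorphic bases and a class deriving from both.
struct A { virtual ~A() {} int a = 0; };
struct B { virtual ~B() {} int b = 0; };
struct C : A, B { int c = 0; };

// Option 1: use the raw pointer value as the key.
template <class T>
const void *keyPlain(const T *p) {
    return reinterpret_cast<const void *>(p);
}

// Option 3: use the address of the most-derived object as the key.
// Requires T to be polymorphic.
template <class T>
const void *keyMostDerived(const T *p) {
    return dynamic_cast<const void *>(p);
}
```

For a C object viewed through an A* and a B*, keyPlain gives two different keys because the B subobject sits at an offset inside C, whereas keyMostDerived gives the same key for both.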

Bugs and Planned Enhancements

Issues specific to Microsoft Visual C++

Licensing

SGEN consists of two parts: the code generator program and the run-time library. The code generator is made available in executable form by special arrangement. Source code for the run-time library used by the generated code is available under the GNU Lesser General Public License, which permits you to use it in proprietary applications. (Modifications to the library itself must remain open source.)

Acknowledgements

The progenitor of SGEN was a program developed at Cadence in the late 1980s by Steve Teig. The idea proved so useful that it spread to a variety of IC CAD and molecular design software companies who each developed their own version, including BioCAD (now Accelrys), QuickLogic, Synplicity, CombiChem (now DeltaGen), Adaptive Silicon, and Roche.

David Harrison developed the generator used by the author at BioCAD. Feedback and suggestions based on precursors of SGEN were provided by: Doug Barnum, Brian Goldman, Andrew Smellie, and Steve Teig of CombiChem; Mukesh Lulla of TeamF1; and Geoffrey Ellis and Heng-Yi Chao of Adaptive Silicon.