DataTime Processing Framework
Project Requirements, Phase 1: Core
The DataTime Processing Framework (DTPF) is a C++ framework facilitating the
creation of time-based data processing systems. While applicable to a
wide range of systems (audio/visual processing, sensory data
acquisition, digital control systems, etc.), the immediate intent is to
support the creation of computational models of sensory processing, for
example models of audio-visual sensory integration in developing human
infants. Such models tend to be modular, and studies involving them
typically examine a set of similar models with variations or extensions
to a basic form, deal with large data rates (e.g., audio/visual data)
and use computationally expensive algorithms that can benefit from
hardware parallelization. DTPF has the primary goal of making it easier
for various programmers
and researchers to create sensory models in a modular manner. The first
phase will involve implementing the core framework for off-line (as
opposed to real-time) processing. This phase will also not emphasize
graphical user interface (GUI) aspects, though it will provide features
for later GUI integration.
The initial purpose of this project is to provide a
well-engineered foundation for creating computational models of sensory
integration and attention in human infants. This specifically includes
Sensory Models of Attention (ESMA), which are composed of distinct modules
performing specific functions, and share a general 'pipe and filter' architecture
with many other computational models which operate on sensory data. As
it is the nature of these models to be highly modularized, there is much potential
for reuse of common modules in different experiments and models.
Modularity and reuse also directly facilitate the style in which the models are
used, as studies involving them typically examine a set of similar models
with variations or extensions to a basic form. These models also work with
high rates of data, performing more complex processing than traditional
multimedia applications (e.g., audio-visual mutual information calculation,
Hershey & Movellan, 2000).
Even for non real-time situations, the computationally expensive algorithms
involved can benefit from hardware parallelization. Although the focus
here is on sensory models that can run without real-time constraints, we
are beginning to work with hardware devices as well (e.g., robotic devices).
Introducing soft real-time requirements (soft real-time systems
such as video players can tolerate some indeterminacy in timing; hard real-time
systems such as aircraft control in general do not have that luxury)
enhances the importance of being able to harness hardware and software parallelism.
In previous work we have been developing customized
software programs (e.g., SoundStream and SenseStream;
for applications of SenseStream see Prince & Hollich, in press;
Prince et al., 2004).
While this can reduce the perceived initial amount of time and effort
to get a particular program up and running, it leads to more difficulty
in extending program functionality, maintenance problems, lower code reuse,
and many other software engineering evils. In short, while the customized
approach can be seen as providing short term gains, any real gains
occur at the expense
of long term flexibility. Many of these issues were encountered in the
development and modification of SenseStream. SenseStream dealt with
many of the same processing issues as SoundStream did, but due to the
way these programs were designed and implemented, code from the earlier
SoundStream project found no reuse in SenseStream. The customized
nature of SenseStream has also made subsequent modifications more
difficult. One such addition to the SenseStream program was calculation
of Mel Frequency Cepstral Coefficients (MFCC) from audio data which
involved modifications to the user interface, configuration, audio
processing and mutual information calculation program code. Another
modification that has proved more intrusive is the ongoing integration
of the SoDiBot data acquisition. This has involved circumventing the
original audio and video input (which read data from a file), as well
as modification of SenseStream and the SoDiBot controlling software in
order to communicate data and perform synchronization. A more detailed
discussion of the structure of SenseStream, changes that have been made
to it, and how the DTPF will be used is available in a separate document.
The modifications to SenseStream could have been
made considerably easier if a framework that enabled and encouraged
modularization and reuse had been used for the SenseStream program.
Development could have been
more focused on the processing problems addressed rather than having to
deal with more mundane issues of configuration, data communication and
synchronization. The program code itself could have also been more
applicable to future projects such as the sensory models mentioned above.
In addition to these software engineering issues, as
our work has progressed we have begun
collaborating with a number of psychologists who use computers running
Macintosh OSX almost exclusively. While our group has some access to
OSX computers there are a larger number of Solaris and Linux computers
available at our university. A solution supporting OSX while allowing
for some level of program code portability between different platforms
is desirable. However, the focus of this initial stage is on
flexibility; portability is a secondary objective.
These factors motivate the creation of a more
generalized framework for modular processing of time-based data,
capable of exploiting parallelism by concurrent execution of modules
across multiple processors and multiple computers, while allowing
multiple operating system platforms to be exploited. Such a framework,
well designed, could also be applied to the more traditional multimedia
domain or general problems involving processing of time-based data.
The stakeholders involved with DTPF can roughly be
categorized as follows:
- End-users: The envisioned users of
DTPF are psychologists or other researchers conducting simulations and
experiments with sensory models. The non-programmer's interaction with
DTPF will of course be through end-user software. End-users are
considered stakeholders here to the extent that the framework must
support applications which meet their needs.
- Programmers: Programmers implementing model components or applications
will interact with the framework through its application programming
interface (API) as well as end-user software.
- Implementors: Framework developers (who could be
considered application developers or end-users as well) will be
involved with implementing internals of the framework in addition to
making use of the API and end-user software.
The experience and knowledge of these stakeholders can vary considerably, from student
programmers early in their undergraduate careers to professional
researchers whose programming background ranges from extensive to non-existent.
Experience with different OS platforms and software packages can be
expected to show similar variation.
Several risk factors present themselves concerning
the creation of the DTPF. Planning and implementation of the first
phase discussed here entails considerable effort, potentially on the
order of one person-year. Limiting the scope of this first phase and
employing appropriate software engineering principles could reduce the
effort required. There is also the risk that the implemented DTPF will
not adequately meet our stakeholders' needs: programmers might find it
difficult to develop models using the API provided; end-users might
find the resulting end-user software too complicated or incomplete for
their purposes. This could be due, among other reasons, to missing,
incomplete or incorrect functionality that hinders the development of
sensory models or end-user software with the DTPF.
Despite these risks, the potential long-term benefits of creating the DTPF are still
attractive. Exposure to these risks can be reduced by keeping them in
mind through the stages of planning and implementation.
While this project intends to create a new
framework, considering existing software is important. Existing
software can reveal successful design approaches and desirable
functionality, as well as drawbacks and potential pitfalls. We are also
interested in software toolkits that may be useful in creating the DTPF
and facilitating future projects. The discussion of this software is available separately.
III. System Environment
The first phase of
DTPF shall run on single desktop/laptop computers running Macintosh OS
X with necessary
support software installed.
IV. Functional Requirements for the Core Framework
The requirements presented here are quite dependent on the architectural model (an
architectural model gives a top-level view of the organization of a system, see Figure 1),
and in some cases even imply specific implementation. While this would be
inappropriate for some systems, specifying requirements of a framework
requires some reference to architectural structure.
Avoiding this would make clear specification of requirements much more
difficult. In short, for a framework we feel it is valid to require a
certain architectural model.
Figure 1: A top-level view of the
architectural model. Note that the connections shown between nodes are
illustrative of a hypothetical situation. In general any node can be
connected to any other.
The architectural model adopted is roughly that of
the BeOS MediaKit. Processing is organized into nodes that can
be connected together to send processing data from one to another.
Processing data is stored in buffers, and
data is transferred by sending references to buffers from a producer
node to a consumer node. A roster manages framework
resources, and controls certain shared resources. A client interface lets
program code (node implementations or client applications for example)
communicate with the roster through service requests. A plug-in loader
is responsible for loading plug-in nodes that can be specified
dynamically at run-time.
Many of the concepts presented here are related in
one way or another. As a result of this the current document contains
some redundancy between sections (particularly section IV.G discussing
the roster). This will be resolved as this document progresses.
Data processing is divided into nodes
with interfaces for configuration, control and data I/O.
- Nodes are implemented as subclasses of a framework class,
and may be statically linked in application code or dynamically linked
from shared libraries.
- To make a node available to the framework it is registered
with the roster using a
framework provided method. This involves communicating information to
the roster that is needed to identify and communicate with the node.
(See § IV.G)
- To make a registered node unavailable to the
framework it is unregistered using a framework provided method.
- A registered node can be in stopped, starting,
started and stopping
states. A node is transitioned into the starting and
stopping states by slot methods that
are called by the framework. When these slot methods complete the node
is considered to have transitioned to its next state.
Figure 2: Registered node states.
- In the stopped state a node may be requested to perform
all query and configuration operations it supports.
- In the started state a node is assumed to be actively
processing. Query and configuration
operations may fail at the discretion of the node implementation.
- The starting and stopping states are transitional; the
framework prevents query or configuration operations from being
performed when a node is in either of these states.
- Methods are provided to query and update a node's
configuration parameters. (See § IV.F)
- Methods are provided for a node to publish its data input
and output capabilities. These capabilities are
represented by outputs and inputs. A node
informs the roster of changes to its published inputs or outputs.
- A protocol is provided for connecting nodes which produce
output data to nodes which accept input data. A connection
establishes an agreement that an output node will produce a particular
type of data, and
that an input node will consume that data. (See § IV.C)
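The node life cycle described above can be sketched as a small state machine. This is only an illustration under assumed names: `Node`, `NodeState`, `onStart`, `onStop`, `onConfigure` and `GainNode` are hypothetical and are not taken from the DTPF API.

```cpp
#include <cassert>
#include <stdexcept>

// All names here are hypothetical; they are not taken from the DTPF API.
enum class NodeState { Stopped, Starting, Started, Stopping };

class Node {
public:
    virtual ~Node() = default;
    NodeState state() const { return state_; }

    // Framework side: enter the transitional state, call the slot method,
    // and settle in the next stable state once the slot method returns.
    void start() { transition(NodeState::Stopped, NodeState::Starting, NodeState::Started); }
    void stop()  { transition(NodeState::Started, NodeState::Stopping, NodeState::Stopped); }

    // Configuration is blocked in the transitional states and is otherwise
    // left to the discretion of the node implementation.
    bool configure(int value) {
        if (state_ == NodeState::Starting || state_ == NodeState::Stopping) return false;
        return onConfigure(value);
    }

protected:
    virtual void onStart() = 0;               // slot method
    virtual void onStop() = 0;                // slot method
    virtual bool onConfigure(int value) = 0;  // slot method

private:
    void transition(NodeState from, NodeState via, NodeState to) {
        if (state_ != from) throw std::logic_error("invalid state transition");
        state_ = via;
        if (via == NodeState::Starting) onStart(); else onStop();
        state_ = to;
    }
    NodeState state_ = NodeState::Stopped;
};

// Example node that accepts configuration only while stopped.
class GainNode : public Node {
public:
    int gain = 1;
    bool refusedWhileStarting = false;
protected:
    void onStart() override { refusedWhileStarting = !configure(99); }
    void onStop() override {}
    bool onConfigure(int value) override {
        if (state() != NodeState::Stopped) return false;
        gain = value;
        return true;
    }
};
```

A fuller version would also need to decide which state a node is left in when a slot method throws; the sketch leaves that open.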
Node execution is event driven.
- Nodes are controlled by events corresponding to the various
actions that can be performed on a node.
- A framework provided event looper is
responsible for receiving and dispatching events by calling framework
defined hook and slot methods.
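As a rough illustration of event driven execution, the sketch below queues events and dispatches them to handler methods. The event kinds, `EventHandler`, `EventLooper` and `LoggingNode` are assumptions made for this example. It is single-threaded, whereas a real looper would likely block on a thread-safe queue.

```cpp
#include <cassert>
#include <queue>
#include <string>
#include <vector>

// Hypothetical event kinds and handler names, for illustration only.
enum class EventType { Start, Stop, BufferReceived, ParameterChanged };

struct Event {
    EventType type;
    int payload;  // e.g. a buffer or parameter identifier
};

class EventHandler {
public:
    virtual ~EventHandler() = default;
    virtual void handleStart() = 0;                       // slot method
    virtual void handleStop() = 0;                        // slot method
    virtual void handleBuffer(int bufferId) = 0;          // slot method
    virtual void handleParameterChanged(int /*id*/) {}    // hook method (optional)
};

class EventLooper {
public:
    explicit EventLooper(EventHandler& handler) : handler_(handler) {}

    void post(const Event& event) { queue_.push(event); }

    // Dispatches all queued events in arrival order; returns how many ran.
    int runPending() {
        int dispatched = 0;
        while (!queue_.empty()) {
            Event e = queue_.front();
            queue_.pop();
            switch (e.type) {
                case EventType::Start:            handler_.handleStart(); break;
                case EventType::Stop:             handler_.handleStop(); break;
                case EventType::BufferReceived:   handler_.handleBuffer(e.payload); break;
                case EventType::ParameterChanged: handler_.handleParameterChanged(e.payload); break;
            }
            ++dispatched;
        }
        return dispatched;
    }

private:
    EventHandler& handler_;
    std::queue<Event> queue_;
};

// Example node that records the events it handles; it does not override the hook.
class LoggingNode : public EventHandler {
public:
    std::vector<std::string> log;
    void handleStart() override { log.push_back("start"); }
    void handleStop() override { log.push_back("stop"); }
    void handleBuffer(int id) override { log.push_back("buffer " + std::to_string(id)); }
};
```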
For communication of processing
data (audio samples, video frames, etc.), a node which outputs data can be
connected to a node which accepts data input.
- To establish a connection an input and an output for a
desired data format type must be identified. (See §
IV.A & IV.E)
- Client code specifies details of the desired data format.
- Once the input, output and desired format have been identified,
the client makes a request to establish the connection. (See §
If either of the nodes involved is not in the stopped state an error
condition is raised, terminating the connection process.
- Format information and output
identification is sent to the output (producer) node. The output node
can specify preferred values for
unspecified format attribute values. At this point the output node
can terminate the connection process for any implementation dependent
reason by raising an error condition.
- If the output node did not terminate the connection
process, format information from the output node and
input identification is sent to the input (consumer) node. The input
node can inspect the format configured
by the output node. At this point the input node can terminate the
connection process for any implementation dependent reason by raising
an error condition.
- Provided neither node raises an error condition the
connection process is considered complete after the input node has been
consulted. Both nodes should be ready to participate in their
respective roles as producer and consumer.
- A connection must be associated with a buffer group before
the nodes involved can be successfully started.
- Two connected nodes can be disconnected provided both nodes
are in the stopped state.
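The connection steps above can be sketched as follows: check that both nodes are stopped, consult the producer, then consult the consumer. All type and method names (`Format`, `ProducerNode`, `ConsumerNode`, `connect`, `Microphone`, `Analyzer`) and the use of -1 for an unspecified attribute are assumptions for illustration.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// Hypothetical types; DTPF's actual classes and method names may differ.
const int kUnspecified = -1;  // stands in for a wildcard attribute value

struct Format {
    std::string type;  // e.g. "audio/sampled"
    int sampleRate;    // kUnspecified when the client leaves it open
};

struct ConnectionError : std::runtime_error {
    using std::runtime_error::runtime_error;
};

struct ProducerNode {
    bool stopped = true;
    virtual ~ProducerNode() = default;
    // May fill in unspecified attributes, or throw to refuse the connection.
    virtual void prepareOutput(Format& format) = 0;
};

struct ConsumerNode {
    bool stopped = true;
    virtual ~ConsumerNode() = default;
    // May inspect the format chosen by the producer, or throw to refuse.
    virtual void acceptInput(const Format& format) = 0;
};

// Returns the agreed format, or throws ConnectionError at the failing step.
Format connect(ProducerNode& out, ConsumerNode& in, Format requested) {
    if (!out.stopped || !in.stopped)
        throw ConnectionError("both nodes must be stopped");
    out.prepareOutput(requested);  // producer is consulted first
    in.acceptInput(requested);     // then the consumer
    return requested;
}

struct Microphone : ProducerNode {
    void prepareOutput(Format& f) override {
        if (f.type != "audio/sampled") throw ConnectionError("unsupported type");
        if (f.sampleRate == kUnspecified) f.sampleRate = 44100;  // preferred value
    }
};

struct Analyzer : ConsumerNode {
    void acceptInput(const Format& f) override {
        if (f.sampleRate > 48000) throw ConnectionError("sample rate too high");
    }
};
```

Exceptions are used here for the error condition only because they keep the sketch short; error codes would fit the same sequence.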
Data passed between nodes is
stored in buffers, organized into buffer groups.
- The framework provides operations for buffer group
allocation, deallocation, lookup of existing buffer groups, and
allocation/deallocation of buffer and buffer group descriptor objects.
- Program code can request constraints on the memory
addressing and alignment used for buffer data when a buffer group is
allocated. These constraints include absolute addressing, aligned addressing (e.g.,
data starting on a page boundary), and non-paged buffer data.
- Each connection between nodes is associated with a buffer
group that is used for the data passed over the connection.
- A buffer group can be associated with more than one
connection between nodes provided all connections involve the same data
format.
- A buffer group has an associated data format (See §
IV.E) that is common to all buffers in the buffer group. This format is
the same as the format of any connections the buffer group is
associated with.
- Access rights to a buffer are acquired from
the owning buffer group through the framework before use. These rights
are transferred from producer to consumer node when a buffer is sent over
a connection, and are released when the buffer is no longer needed.
This is required to ensure process and thread synchronization. Exclusive
modification rights are a convention to be adhered to by node
implementations to ensure data integrity.
- A buffer is sent to a consumer node using a framework
provided method which results in the consumer node being informed that a
specific buffer is available.
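The acquire, send and release cycle for buffer access rights might look roughly like the sketch below. `BufferGroup`, its methods and the use of plain integers as node identifiers are assumptions for illustration. A single mutex stands in for whatever process and thread synchronization the framework would actually use.

```cpp
#include <cassert>
#include <cstddef>
#include <mutex>
#include <vector>

// Hypothetical sketch: node identifiers are plain ints and kFree marks a
// buffer that nobody currently holds.
const int kFree = -1;

class BufferGroup {
public:
    BufferGroup(std::size_t count, std::size_t bytesPerBuffer)
        : data_(count, std::vector<unsigned char>(bytesPerBuffer)),
          holder_(count, kFree) {}

    // Grants access rights on a free buffer to `node`; returns its index,
    // or -1 when every buffer is in use.
    int acquire(int node) {
        std::lock_guard<std::mutex> lock(mutex_);
        for (std::size_t i = 0; i < holder_.size(); ++i) {
            if (holder_[i] == kFree) {
                holder_[i] = node;
                return static_cast<int>(i);
            }
        }
        return -1;
    }

    // Transfers rights from producer to consumer; only the holder may send.
    bool send(int buffer, int from, int to) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!valid(buffer) || holder_[buffer] != from) return false;
        holder_[buffer] = to;
        return true;
    }

    // Returns the buffer to the group; only the holder may release.
    bool release(int buffer, int node) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!valid(buffer) || holder_[buffer] != node) return false;
        holder_[buffer] = kFree;
        return true;
    }

    // Buffer data is handed out only to the current holder.
    unsigned char* data(int buffer, int node) {
        std::lock_guard<std::mutex> lock(mutex_);
        if (!valid(buffer) || holder_[buffer] != node) return nullptr;
        return data_[buffer].data();
    }

private:
    bool valid(int buffer) const {
        return buffer >= 0 && static_cast<std::size_t>(buffer) < holder_.size();
    }
    std::vector<std::vector<unsigned char>> data_;
    std::vector<int> holder_;
    std::mutex mutex_;
};
```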
Data format descriptions are handled by the framework in a manner independent of
specific format implementations.
- Specific format types have a unique name composed from the
'supertype' and 'subtype' (e.g., 'audio/raw', 'video/packed', 'raw',
etc.). Project level organization (e.g., 'esma/audio-features') of
formats is also possible. The 'supertype/subtype' style is not a
requirement on the naming; it is used here for illustration.
- The framework provides default formats for 'audio/sampled',
'video/packed', 'video/planar' and 'raw' types. Wildcard or 'don't care'
values are provided for the specific format attributes where
appropriate.
  - 'audio/sampled': Attributes include sampling rate, samples per buffer, sample data type, quantization type (linear, µ-law, etc.) and channel count.
  - 'video/packed': Describes video data organized into arrays of pixels containing all color components. Attributes include image dimensions, pixels per image row, storage size per image row and video frame rate.
  - 'video/planar': Describes video data organized into contiguous 'planes' of data values with one plane for each color component. Attributes include pixels per plane row per plane, storage size per plane row per plane, and video frame rate.
  - 'raw': Describes raw data. Attributes include data type (real32, etc.) and number of elements per buffer.
  - All data formats: Frame rate (buffers per unit time) of the data.
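One way to handle format descriptions independently of specific formats is a generic descriptor in which a missing attribute acts as a wildcard. The sketch below assumes this representation. `FormatDescription`, `compatible`, `resolve` and the attribute keys are hypothetical, and attribute values are kept as strings purely to keep the example short.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical generic descriptor: an attribute that is absent is treated
// as a wildcard ("don't care").
struct FormatDescription {
    std::string type;                               // e.g. "audio/sampled"
    std::map<std::string, std::string> attributes;  // e.g. {"channels", "2"}
};

// Two descriptions are compatible when their type names are equal and no
// attribute specified by both has conflicting values.
bool compatible(const FormatDescription& a, const FormatDescription& b) {
    if (a.type != b.type) return false;
    for (const auto& entry : a.attributes) {
        auto it = b.attributes.find(entry.first);
        if (it != b.attributes.end() && it->second != entry.second) return false;
    }
    return true;
}

// Fills wildcards in `request` from `preferred`. Values already present in
// the request are kept, because map::insert does not overwrite existing keys.
FormatDescription resolve(FormatDescription request, const FormatDescription& preferred) {
    request.attributes.insert(preferred.attributes.begin(), preferred.attributes.end());
    return request;
}
```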
Nodes can publish a list of
parameters, where each parameter is described by a parameter model, organized into parameter
groups. This allows uniform configuration by other program code,
and saving and restoration of configurations.
- Parameter groups can contain any number of parameter models
and other parameter groups. There is one root parameter group for a node.
- Parameter models include a parameter name, value, textual
label, and preferred user interface (UI) controller type to allow generation
of configuration UIs.
- Types of parameter models include on/off, finite choice (multiple or single selection), range,
file/directory, and single/multi-line text. The following entries give
general descriptions of these models and GUI components typically
associated with them. The GUI components are listed here to help
illustrate future uses of these models.
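  - On/off: Yes or no parameter.
  - Finite choice, multiple selection: List of choices allowing multiple selection. Typical GUI components: multiple selection combo boxes, groups of checkboxes.
  - Finite choice, single selection: List of choices allowing only one selection. Typical GUI components: radio buttons, single selection combo boxes.
  - Range: Parameter values lying between two values, inclusive.
  - File/directory: Parameters that refer to file system entries. Typical GUI components: file selection dialogs.
  - Text: Single or multiple line text for comments or display of information. Typical GUI components include text input fields.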
- Parameter models indicate whether a parameter can be
modified during processing (corresponding to a node in the started state), during
setup only (corresponding to a node in the stopped state), or not at all
(provided for informational purposes).
- Parameter models are configured to trigger an automatic
parameter update upon modification, or to wait for program code to
invoke an update. This configuration can be applied to parameter groups,
affecting all parameter models and parameter groups contained within.
- Changes to a node's parameter values are propagated to the
node by the
framework. A node notifies the framework of changes the node makes to
its own parameter values.
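The parameter tree described above could be modelled roughly as follows. `ParameterModel`, `ParameterGroup`, `Modifiable`, `findParameter` and `setValue` are hypothetical names, and values are stored as strings for brevity. The sketch relies on `std::vector` accepting an incomplete element type, which is guaranteed from C++17.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical names; values are plain strings to keep the sketch short.
enum class Modifiable { WhileStarted, SetupOnly, Never };

struct ParameterModel {
    std::string name;
    std::string label;
    std::string value;
    Modifiable modifiable;
};

// A group holds parameter models and nested groups; a node has one root group.
struct ParameterGroup {
    std::string name;
    std::vector<ParameterModel> models;
    std::vector<ParameterGroup> groups;
};

// Depth-first search for a parameter by name; returns nullptr if absent.
ParameterModel* findParameter(ParameterGroup& group, const std::string& name) {
    for (auto& model : group.models)
        if (model.name == name) return &model;
    for (auto& child : group.groups)
        if (ParameterModel* found = findParameter(child, name)) return found;
    return nullptr;
}

// Applies a change only when the parameter's model allows it in the node's
// current state (started or stopped).
bool setValue(ParameterGroup& root, const std::string& name,
              const std::string& value, bool nodeStarted) {
    ParameterModel* model = findParameter(root, name);
    if (!model) return false;
    if (model->modifiable == Modifiable::Never) return false;
    if (model->modifiable == Modifiable::SetupOnly && nodeStarted) return false;
    model->value = value;
    return true;
}
```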
A roster manages framework
resources, and acts as broker for framework services.
- A list of dormant nodes is maintained.
- A list of active nodes that are currently registered with
the roster is maintained. Each registered active node is assigned a
unique identifier by the roster. For each node lists of published
inputs and outputs are maintained by the roster.
- A list of active connections between nodes is maintained.
The connection information identifies the sending and receiving nodes,
and the data format used.
- A list of buffer groups is maintained. Each buffer group is
assigned a unique identifier. The roster tracks which connection(s) a
buffer group is associated with.
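The bookkeeping role of the roster can be illustrated with a minimal registry that assigns unique identifiers and records connections. `Roster`, `NodeId`, `ConnectionInfo` and the method names are assumptions. Refusing to unregister a node that still has connections is a design choice of this sketch, not a stated requirement.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <string>
#include <vector>

// Hypothetical identifiers and records, for illustration only.
typedef std::uint32_t NodeId;

struct ConnectionInfo {
    NodeId producer;
    NodeId consumer;
    std::string format;
};

class Roster {
public:
    // Registers a node and returns the unique identifier assigned to it.
    NodeId registerNode(const std::string& name) {
        NodeId id = nextId_++;
        nodes_[id] = name;
        return id;
    }

    // Assumption of this sketch: unknown nodes and nodes that still take
    // part in a connection cannot be unregistered.
    bool unregisterNode(NodeId id) {
        if (nodes_.count(id) == 0) return false;
        for (const auto& c : connections_)
            if (c.producer == id || c.consumer == id) return false;
        nodes_.erase(id);
        return true;
    }

    // Records a connection between two registered nodes.
    bool addConnection(NodeId producer, NodeId consumer, const std::string& format) {
        if (nodes_.count(producer) == 0 || nodes_.count(consumer) == 0) return false;
        connections_.push_back(ConnectionInfo{producer, consumer, format});
        return true;
    }

    std::size_t nodeCount() const { return nodes_.size(); }
    std::size_t connectionCount() const { return connections_.size(); }

private:
    NodeId nextId_ = 1;
    std::map<NodeId, std::string> nodes_;
    std::vector<ConnectionInfo> connections_;
};
```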
A plug-in loader handles
loading of dormant plug-in nodes.
- The plug-in loader handles requests to load a dormant node
or to unload a previously loaded node.
- Multiple instances of the same plug-in node can be loaded, except
where that is inappropriate for a node (e.g., a node requiring
exclusive access to a hardware device). Node implementations
enforce this constraint.
DTPF provides a
client interface to the roster. The following operations are provided:
- Requesting a list of a node's inputs and outputs that
are not yet connected, currently connected inputs and outputs, or all
inputs and outputs. These operations can be constrained to inputs and outputs of a
specified format type.
- Establishing a new connection between an output and an input.
- Breaking [ToDo: 'breaking' sounds bad, investigate
alternative terminology...] a previously established connection.
- Registering a node with the roster.
- Unregistering a node.
- Requesting a list of nodes that have been registered
with the roster. This can be constrained to nodes that support
specified input and output format types.
- Requesting a list of dormant plug-in nodes that can be loaded by
the plug-in loader. This can be limited to nodes that support specified
input and output format types.
- Loading an instance of a dormant plug-in node.
- Releasing a plug-in node to allow the framework to
deallocate the node.
- Requesting an active node's top-level parameter group.
- Applying changes to a node's parameters.
- Prerolling, starting and stopping an active node.
- Registering a listener to receive notification when:
Listeners can be registered to receive all notifications or a subset of
notifications. The notifications delivered to a listener can also be
- A node is registered or unregistered.
- A new connection is established or an existing
connection is broken.
- Changes are made to a node's parameters.
- A node transitions to the prerolling, prerolled,
starting, started, stopping or stopped states.
- Unregistering a listener so it no longer receives notifications.
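Listener registration with a subset of notifications can be sketched with a bit mask per listener. The notification names, `NotificationCenter` and its methods are hypothetical. The sketch collapses the notification kinds listed above into four flags and passes only a node identifier to the callback.

```cpp
#include <cassert>
#include <functional>
#include <map>

// Hypothetical notification kinds, combined as bit flags.
enum Notification : unsigned {
    NodeRegistration = 1u << 0,  // node registered or unregistered
    ConnectionChange = 1u << 1,  // connection established or removed
    ParameterChange  = 1u << 2,  // node parameters changed
    StateChange      = 1u << 3,  // node changed state
    AllNotifications = 0xFu
};

class NotificationCenter {
public:
    typedef std::function<void(Notification, int)> Callback;

    // Registers a listener for the notification kinds set in `mask` and
    // returns a handle that can later be passed to removeListener.
    int addListener(unsigned mask, Callback callback) {
        int id = nextId_++;
        listeners_[id] = Entry{mask, callback};
        return id;
    }

    bool removeListener(int id) { return listeners_.erase(id) > 0; }

    // Delivers a notification to every listener whose mask includes it.
    // A snapshot is iterated so callbacks may add or remove listeners.
    void notify(Notification what, int nodeId) {
        std::map<int, Entry> snapshot = listeners_;
        for (auto& kv : snapshot)
            if (kv.second.mask & what) kv.second.callback(what, nodeId);
    }

private:
    struct Entry {
        unsigned mask;
        Callback callback;
    };
    int nextId_ = 1;
    std::map<int, Entry> listeners_;
};
```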
A driver program allows
integration testing of the framework roster interface and provides a means to
set up initial 'models'.
V. Non-functional Requirements for
the Core Framework
To facilitate use and reuse for diverse models,
by diverse users, the framework design and code is 'well
documented' for the programmer and user.
- The project development documentation will be maintained
throughout the project. These documents include a project plan,
project requirements (this document),
design documents and diagrams, and test procedures and plans. [ToDo:
Specify structure/content of these documents, link/refs to common
- User documentation includes instructions on installing DTPF
and needed support software, enumeration of supported and tested
platforms and software configurations, and instructions detailing
operation of framework applications.
- For model and application programmers the application
programming interface (API) is thoroughly and consistently
documented. This documentation includes the intent and proper use of
classes, methods and members,
descriptions of method pre-conditions, arguments, return values and
General discussion of important concepts, possible
and shortcomings of the software are also included. Design and other
diagrams are to be included as appropriate.
- Documentation for project developers similar in detail to
API documentation is provided. This documentation covers the internals
of the framework implementation.
- A source-level documentation system (such as Doxygen, see also: Williams, 2004) is
used to generate programming interface documentation.
The implementation language is
ISO standardized C++. [ToDo: Make sure this is correct way to refer to
'standard' C++, link/ref]
DTPF is released as open-source
software under the Academic
Free License, version 2.1 (AFL).
- All core framework libraries and applications are
compatible with the AFL license.
- All software which the core framework depends upon is
compatible with this license. ACE, YARP and MUSCLE satisfy this requirement.
- Tools used are not limited to a single platform. This includes tools
used for planning, documentation, development and other tasks.
Commercial or proprietary products are avoided where possible.
- Client applications and node implementations that are not
part of the core framework, and are distributed separately from the
framework, are not directly constrained by this license.
- DTPF is a SourceForge
project. SourceForge's services are used for version control, website
hosting, file releases and applicable aspects of project management.
- The first phase of DTPF is implemented on
Macintosh OS X.
- GNU/Linux is a secondary platform for this phase of DTPF.
Primary development is focused on OS X. In the future GNU/Linux may be
supported, but for this
phase there is no concern to validate builds on GNU/Linux.
- Platform neutral coding practices are followed (e.g., a
pointer is never cast to an integer, no assumptions are made about the
size of 'long', no assumptions are made regarding byte
order).
- For the first phase of implementation, processing
is not real-time. Any computer capable of running OS X is capable of running
the framework. The nodes themselves are the major resource sinks and their
requirements will impose stricter limitations.
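As a small illustration of the platform neutral coding practices listed above, the sketch below uses fixed-width integer types instead of 'long' and writes values in an explicit byte order rather than relying on the host's. The helper names `toBigEndian` and `fromBigEndian` are hypothetical, and big-endian is chosen arbitrarily for the example.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical helpers: store and load a 32-bit value in a fixed
// (big-endian) byte layout, whatever the byte order of the host is.
std::array<std::uint8_t, 4> toBigEndian(std::uint32_t value) {
    std::array<std::uint8_t, 4> bytes;
    bytes[0] = static_cast<std::uint8_t>((value >> 24) & 0xFFu);
    bytes[1] = static_cast<std::uint8_t>((value >> 16) & 0xFFu);
    bytes[2] = static_cast<std::uint8_t>((value >> 8) & 0xFFu);
    bytes[3] = static_cast<std::uint8_t>(value & 0xFFu);
    return bytes;
}

std::uint32_t fromBigEndian(const std::array<std::uint8_t, 4>& bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}
```

Because the shifts operate on values rather than on memory, the same code produces the same byte sequence on PowerPC and x86 hosts.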
Allowances are made for future
soft real-time processing and multiple machine distribution.
The core framework consumes minimal resources.
- When not involved in handling requests for services the
roster, plug-in loader, roster client and event loopers consume no more
than 1% of processor time on a 500 MHz Macintosh G4 system running OS
X. Ideally this will be 0% usage.
- The roster resources use no more than 1 KB of memory per
entry. Each registered node, dormant plug-in node, input, output,
format, connection, buffer
group description, etc. is considered a resource here. Buffer groups
themselves involve at least as much memory for the buffers they contain.
Academic Free License:
Doxygen:
- Home Page: http://www.doxygen.org
- Williams, A. (2004). Examining Doxygen. Dr. Dobb's Journal, October 2004
(pp. 52-56). CMP Media LLC, San Francisco.
Mutual Information Algorithm:
- Hershey, J., & Movellan, J. (2000).
Audio-vision: Using audio-visual synchrony to locate sounds. In S. A.
Solla, T. K. Leen, & K. R. Muller (eds.), Advances in Neural
Information Processing Systems 12 (pp. 813-819). Cambridge, MA: MIT Press.
SenseStream:
- Home Page: http://www.cprince.com/PubRes/SenseStream
- Prince, C. G. & Hollich, G. J. (in press). Synching infants
models: A perceptual-level model of infant synchrony detection. The Journal
of Cognitive Systems Research, Special Issue on Epigenetic
Robotics. Internet: http://dx.doi.org/10.1016/j.cogsys.2004.11.006
- Prince, C. G., Hollich, G. J., Helder, N. A., Mislivec, E. J.,
A., Salunke, S., & Memon, N. (2004). Taking synchrony seriously: A
perceptual-level model of infant synchrony detection. Paper presented
at The Fourth International Workshop on Epigenetic
Robotics: Modeling Cognitive Development in Robotic Systems, held
at Genoa, Italy, August 25-27, 2004. (pp. 89-96). http://www.lucs.lu.se/ftp/pub/LUCS_Studies/LUCS117/prince.pdf
- hook method: Polymorphic method defining an interface for which subclasses may
optionally provide an implementation.
- plug-in: A software component that can be dynamically loaded at run-time.
- slot method: Polymorphic method defining an interface that must be implemented
by subclasses.
© 2005 Eric J.
Mislivec, Last Modified: 12 July, 2005