DTPF: DataTime Processing Framework

Project Requirements, Phase 1: Core Framework

The DataTime Processing Framework (DTPF) is a C++ framework facilitating the creation of time-based data processing systems. While applicable to a wide range of systems (audio/visual processing, sensory data acquisition, digital control systems, etc.), the immediate intent is to support the creation of computational models of sensory processing, for example models of audio-visual sensory integration in developing human infants. Such models tend to be modular. Studies involving them typically examine a set of similar models with variations on or extensions to a basic form, deal with high data rates (e.g., audio/visual data), and use computationally expensive algorithms that can benefit from hardware parallelization. The primary goal of DTPF is to make it easier for programmers and researchers to create sensory models in a modular manner. The first phase will involve implementing the core framework for off-line (as opposed to real-time) processing. This phase will not emphasize graphical user interface (GUI) aspects, though it will provide features for later GUI integration.

Motivation
    The initial purpose of this project is to provide a well-engineered foundation for creating computational models of audio-visual sensory integration and attention in human infants. This specifically includes Epigenetic Sensory Models of Attention (ESMA), which are composed of distinct components performing specific functions and share a general 'pipe and filter' structure with many other computational models that operate on sensory data. Because these models are by nature highly modularized, there is much potential for reuse of common modules in different experiments and models. Modularization and reuse also directly facilitate the style in which the models are used, as studies involving them typically examine a set of similar models with variations on or extensions to a basic form. These models also work with high rates of data, performing more complex processing than traditional multimedia applications (e.g., audio-visual mutual information calculation; Hershey & Movellan, 2000). Even for non-real-time situations, the computationally expensive algorithms used can benefit from hardware parallelization. Although the focus here is on sensory models that can run without real-time constraints, we are beginning to work with hardware devices as well (e.g., robotic pan-tilt cameras, SoDiBot). Introducing soft real-time requirements further increases the importance of being able to harness hardware and software parallelism. (Soft real-time applications such as video players can tolerate some indeterminacy in timing; hard real-time systems such as aircraft control in general do not have that luxury.)
    In previous work we have been developing customized software programs (e.g., SoundStream, SenseStream; for applications of SenseStream see: Prince & Hollich, in press; Prince et al., 2004). While this can reduce the perceived initial time and effort required to get a particular program up and running, it makes program functionality harder to extend and leads to maintenance problems, lower code reusability and many other software engineering evils. In short, while the 'one-shot' approach can be seen as providing short-term gains, any real gains occur at the expense of long-term flexibility. Many of these issues were encountered in the development and modification of SenseStream. SenseStream dealt with many of the same processing issues as SoundStream did, but due to the way these programs were designed and implemented, code from the earlier SoundStream project found no reuse in SenseStream. The customized nature of SenseStream has also made subsequent modifications more difficult. One such addition to the SenseStream program was the calculation of Mel Frequency Cepstral Coefficients (MFCC) from audio data, which involved modifications to the user interface, configuration, audio processing and mutual information calculation program code. Another modification that has proved more intrusive is the ongoing integration of the SoDiBot data acquisition. This has involved circumventing the original audio and video input (which read data from a file), as well as modifying SenseStream and the SoDiBot controlling software in order to communicate data and perform synchronization. A more detailed discussion of the structure of SenseStream, changes that have been made to it, and how the DTPF will be used is available here.
    The modifications to SenseStream could have been made considerably easier if a framework that enabled and encouraged modularization and reuse had been used for the SenseStream program. Development could have been more focused on the processing problems addressed rather than having to deal with more mundane issues of configuration, data communication and synchronization. The program code itself could also have been more directly applicable to future projects such as the sensory models mentioned above.
    In addition to these software engineering issues, as our work has progressed we have begun collaborating with a number of psychologists who use computers running Macintosh OS X almost exclusively. While our group has some access to OS X computers, there are a larger number of Solaris and Linux computers available at our university. A solution supporting OS X while allowing for some level of program code portability between different platforms is desirable. However, the focus of this initial stage is on flexibility; portability is a secondary objective.
    These factors motivate the creation of a more generalized framework for modular processing of time-based data, capable of exploiting parallelism by concurrent execution of modules across multiple processors and multiple computers, while allowing multiple operating system platforms to be exploited. Such a framework, if well designed, could also be applied to the more traditional multimedia domain or general problems involving processing of time-based data.

Stakeholders
    The stakeholders involved with DTPF can roughly be categorized as follows:

End-users: The envisioned users of DTPF are psychologists or other researchers conducting simulations and experiments with sensory models. The non-programmer's interaction with DTPF will of course be through end-user software. End-users are considered stakeholders here to the extent that the framework must support applications which meet their needs.
Programmers: Programmers implementing model components or applications will interact with the framework through its application programming interface (API) as well as end-user software.
Implementors: Framework developers (who could be considered application developers or end-users as well) will be involved with implementing internals of the framework in addition to making use of the API and end-user software.

The skills and knowledge of these stakeholders can vary considerably, from student programmers early in their undergraduate careers to professional researchers whose programming background ranges from extensive to non-existent. Experience with different OS platforms and software packages can be expected to show similar variation.

Risks
    Several risk factors present themselves concerning the creation of the DTPF. Planning and implementation of the first phase discussed here entails considerable effort, potentially on the order of one person-year. Limiting the scope of this first phase and employing appropriate software engineering principles could reduce the effort required. There is also the risk that the implemented DTPF will not adequately meet our stakeholders' needs: programmers might find it difficult to develop models using the API provided; end-users might find the resulting end-user software too complicated or incomplete for their purposes. This could be due, among other reasons, to missing, incomplete or incorrect functionality that hinders the development of sensory models or end-user software with the DTPF.
    Considering these risks, the potential long-term benefits of creating the DTPF are still attractive. Exposure to these risks can be reduced by keeping them in mind through the stages of planning and implementation.

Existing Software
    While this project intends to create a new framework, considering existing software is important. Existing software can reveal successful design approaches and desirable functionality, as well as drawbacks and potential pitfalls. We are also interested in software toolkits that may be useful in creating the DTPF and facilitating future projects. The discussion of this software is available here.

The first phase of DTPF shall run on single desktop/laptop computers running Macintosh OS X with necessary support software installed.

The requirements presented here are quite dependent on the architectural model (an architectural model gives a top-level view of the organization of a system, see Figure 1), and in some cases even imply specific implementation. While this would be inappropriate for some systems, specifying requirements of a framework requires some reference to architectural structure. Avoiding this would make clear specification of requirements much more difficult. In short, for a framework we feel it is valid to require a certain architectural model.

Figure 1: A top-level view of the DTPF architectural model. Note that the connections shown between nodes are illustrative of a hypothetical situation. In general any node can be connected to any other.

The architectural model adopted is roughly that of the BeOS MediaKit. Processing is organized into nodes, which can be connected together to send processing data from one to another. Processing data is stored in buffers, and this data is transferred by sending references to buffers from a producer node to a consumer node. A roster maintains records of various resources and controls certain shared resources. A client interface allows program code (node implementations or client applications, for example) to communicate with the roster through service requests. A plug-in loader is responsible for loading plug-in nodes that can be specified dynamically at run-time.
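
The producer/consumer relationship described above can be illustrated with a minimal sketch. It assumes C++11 or later and uses `std::shared_ptr` to stand in for "references to buffers"; every class and method name here (`Buffer`, `ProducerNode`, `ConsumerNode`, `SummingNode`, `BufferReceived`) is hypothetical and is not taken from the DTPF or MediaKit APIs. The roster, client interface and plug-in loader are left out.

```cpp
#include <memory>
#include <numeric>
#include <vector>

// Hypothetical names throughout; this is not the DTPF or MediaKit API.

// A buffer holds processing data. Nodes exchange references to it, not copies.
struct Buffer {
    std::vector<float> samples;
};
using BufferRef = std::shared_ptr<const Buffer>;

// A consumer node receives buffer references from a connected producer.
class ConsumerNode {
public:
    virtual ~ConsumerNode() = default;
    virtual void BufferReceived(const BufferRef& buffer) = 0;
};

// A producer node sends each buffer reference to every connected consumer.
class ProducerNode {
public:
    void Connect(ConsumerNode& consumer) { consumers_.push_back(&consumer); }

    void Send(const BufferRef& buffer) {
        for (ConsumerNode* consumer : consumers_) {
            consumer->BufferReceived(buffer);
        }
    }

private:
    std::vector<ConsumerNode*> consumers_;  // non-owning connections
};

// Example consumer: keeps a running sum of all samples it has received
// and remembers the most recent buffer reference.
class SummingNode : public ConsumerNode {
public:
    void BufferReceived(const BufferRef& buffer) override {
        total_ = std::accumulate(buffer->samples.begin(),
                                 buffer->samples.end(), total_);
        last_ = buffer;
    }
    float total() const { return total_; }
    const BufferRef& last() const { return last_; }

private:
    float total_ = 0.0f;
    BufferRef last_;
};
```

Connecting two `SummingNode` objects to one `ProducerNode` and calling `Send` once delivers the same underlying buffer to both consumers; only the reference count changes, which is the property the buffer-passing model relies on for high data rates.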

Many of the concepts presented here are related in one way or another. As a result of this the current document contains some redundancy between sections (particularly section IV.G discussing the roster). This will be resolved as this document progresses.

Academic Free License:

Open Source Initiative Page: http://opensource.org/licenses/afl-2.1.php

ACE:

Home Page: http://www.cs.wustl.edu/~schmidt/ACE.html
ACE Overview: http://www.cs.wustl.edu/~schmidt/ACE-overview.html
Nagel, W. (2004). Real-Time Systems & RT CORBA. Dr. Dobb's Journal, December 2004 (pp. 70-75). CMP Media LLC, San Francisco.

MediaKit Developer Documentation: http://datatime.sourceforge.net/Be%20Book/The%20Media%20Kit/index.html
The Be Book, General BeOS Developer Documentation: http://datatime.sourceforge.net/Be%20Book/index.html
Cortex Home Page: http://cortex.sourceforge.net

Doxygen:

Home Page: http://www.doxygen.org
Williams, A. (2004). Examining Doxygen. Dr. Dobb's Journal, October 2004 (pp. 52-56). CMP Media LLC, San Francisco.

IKAROS:

JMF API Home Page: http://java.sun.com/products/java-media/jmf
Wellings, A. J. (2004). Concurrent and Real-time Programming in Java. John Wiley, Hoboken, NJ.

Mutual Information Algorithm:

Hershey, J., & Movellan, J. (2000). Audio-vision: Using audio-visual synchrony to locate sounds. In S. A. Solla, T. K. Leen, & K. R. Muller (eds.), Advances in Neural Information Processing Systems 12 (pp. 813-819). Cambridge, MA: MIT Press.

Kanehiro, F., Hirukawa, H., & Kajita, S. (2004). OpenHRP: Open Architecture Humanoid Robotics. International Journal of Robotics Research, 23, 155-165. Internet: http://dx.doi.org/10.1177/0278364904041324

SenseStream:

Home Page: http://www.cprince.com/PubRes/SenseStream
Prince, C. G. & Hollich, G. J. (in press). Synching infants with models: A perceptual-level model of infant synchrony detection. The Journal of Cognitive Systems Research, Special Issue on Epigenetic Robotics. Internet: http://dx.doi.org/10.1016/j.cogsys.2004.11.006
Prince, C. G., Hollich, G. J., Helder, N. A., Mislivec, E. J., Reddy, A., Salunke, S., & Memon, N. (2004). Taking synchrony seriously: A perceptual-level model of infant synchrony detection. Paper presented at The Fourth International Workshop on Epigenetic Robotics: Modeling Cognitive Development in Robotic Systems, held at Genoa, Italy, August 25-27, 2004. (pp. 89-96). http://www.lucs.lu.se/ftp/pub/LUCS_Studies/LUCS117/prince.pdf

SoDiBot:

SourceForge:

Home Page: http://www.sourceforge.net
DTPF Project Page: https://sourceforge.net/projects/datatime

YARP:

VII. Glossary

hook method: Polymorphic method defining an interface for which subclasses may optionally provide an implementation.

plug-in: A software component that can be dynamically loaded at run-time.

slot method: Polymorphic method defining an interface that must be implemented by subclasses.
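
The distinction between hook and slot methods maps directly onto C++ virtual functions. The following is a minimal sketch assuming C++11 or later; the class names (`Node`, `UppercaseNode`, `PassThroughNode`) and method names (`Process`, `Started`) are hypothetical and only illustrate the two glossary terms.

```cpp
#include <cctype>
#include <string>

// Hypothetical base class; names are illustrative, not the DTPF API.
class Node {
public:
    virtual ~Node() = default;

    // Slot method: pure virtual, so every concrete subclass must implement it.
    virtual std::string Process(const std::string& input) = 0;

    // Hook method: has a default do-nothing implementation that a subclass
    // may override if it needs to react to the event.
    virtual void Started() {}
};

// Implements the slot and also overrides the hook.
class UppercaseNode : public Node {
public:
    std::string Process(const std::string& input) override {
        std::string out = input;
        for (char& c : out) {
            c = static_cast<char>(std::toupper(static_cast<unsigned char>(c)));
        }
        return out;
    }
    void Started() override { started_ = true; }
    bool started() const { return started_; }

private:
    bool started_ = false;
};

// Implements only the slot; the inherited hook default is used as-is.
class PassThroughNode : public Node {
public:
    std::string Process(const std::string& input) override { return input; }
};
```

A class that omits `Process` cannot be instantiated, while omitting `Started` is harmless; that is the practical difference between a slot and a hook.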

I.	Problem
II.	Background
III.	System Environment
IV.	Functional Requirements
V.	Non-functional Requirements
VI.	References
VII.	Glossary

V.A:	To facilitate use and reuse for diverse models, by diverse users, the framework design and code is 'well documented' for the programmer and user. The project development documentation will be maintained throughout the project. These documents include a project plan, project requirements (this document), design documents and diagrams, and test procedures and plans. [ToDo: Specify structure/content of these documents, link/refs to common formats, recommendations.] User documentation includes instructions on installing DTPF and needed support software, an enumeration of supported and tested platforms and software configurations, and instructions detailing operation of framework applications. For model and application programmers the application programming interface (API) is thoroughly and consistently documented. This documentation includes the intent and proper use of classes, methods and members, and descriptions of method pre-conditions, arguments, return values and post-conditions. General discussion of important concepts, possible difficulties/problems and shortcomings of the software is also included. Design and other diagrams are to be included as appropriate. Documentation for project developers, similar in detail to the API documentation, is provided. This documentation covers the internals of the framework implementation. A source-level documentation system (such as Doxygen, see also: Williams, 2004) is used to generate programming interface documentation.
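
As an illustration of the level of API documentation called for in V.A, the sketch below shows one possible Doxygen comment covering intent, arguments, pre-conditions, post-conditions and return value. The function `MeanAmplitude` is a hypothetical helper invented for this example and is not part of DTPF; the `@brief`, `@param`, `@pre`, `@post` and `@return` commands are standard Doxygen commands.

```cpp
#include <cmath>
#include <cstddef>
#include <vector>

/**
 * @brief Computes the mean absolute amplitude of a block of samples.
 *
 * Hypothetical helper, used here only to illustrate documentation style.
 *
 * @param samples  Sample values to average.
 * @pre  samples is not empty.
 * @post samples is unchanged.
 * @return The arithmetic mean of the absolute sample values.
 */
inline double MeanAmplitude(const std::vector<double>& samples) {
    double sum = 0.0;
    for (std::size_t i = 0; i < samples.size(); ++i) {
        sum += std::fabs(samples[i]);
    }
    return sum / static_cast<double>(samples.size());
}
```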

V.B:	The implementation language is ISO standardized C++. [ToDo: Make sure this is correct way to refer to 'standard' C++, link/ref]

V.C:	DTPF is released as open-source software under the Academic Free License, version 2.1 (AFL). All core framework libraries and applications are compatible with the AFL license. All software which the core framework depends upon is compatible with this license. ACE, YARP and MUSCLE satisfy this requirement. Tools used are not limited to a particular platform. This includes tools used for planning, documentation, development and other tasks. Commercial or proprietary products are avoided where possible. Client applications and node implementations that are not part of the core framework, and are distributed separately from the framework, are not directly constrained by this license. DTPF is a SourceForge project. SourceForge's services are used for version control, website hosting, file releases and applicable aspects of project management.

V.D:	Platform details. The first phase of DTPF is implemented on Macintosh OS X. GNU/Linux is a secondary platform for this phase of DTPF. Primary development is focused on OS X. In the future GNU/Linux may be supported, but for this phase there is no concern to validate builds on GNU/Linux. Platform-neutral coding practices are followed (e.g., a pointer is never cast to an integer, no assumptions are made about the size of '`long`', no assumptions are made regarding byte ordering, etc.). For the first phase of implementation, processing is not real-time. Any computer capable of running OS X is capable of running the framework. The nodes themselves are the major resource sinks, and their requirements will impose stricter limitations.
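
Two of the platform-neutral practices listed in V.D can be sketched briefly: using a fixed-width integer type instead of '`long`', and encoding values with shifts so the result does not depend on the host's byte ordering. This is a minimal sketch assuming C++11 or later (for `<cstdint>` and `std::array`); the function names `EncodeBigEndian32` and `DecodeBigEndian32` are hypothetical and not part of DTPF.

```cpp
#include <array>
#include <cstdint>

// Writes a 32-bit value as four bytes, most significant byte first.
// Shifts and masks operate on the value, not on its memory layout, so the
// output is the same on big-endian and little-endian hosts.
inline std::array<std::uint8_t, 4> EncodeBigEndian32(std::uint32_t value) {
    return {{
        static_cast<std::uint8_t>((value >> 24) & 0xFFu),
        static_cast<std::uint8_t>((value >> 16) & 0xFFu),
        static_cast<std::uint8_t>((value >> 8) & 0xFFu),
        static_cast<std::uint8_t>(value & 0xFFu)
    }};
}

// Reassembles the value from the same byte order.
inline std::uint32_t DecodeBigEndian32(const std::array<std::uint8_t, 4>& bytes) {
    return (static_cast<std::uint32_t>(bytes[0]) << 24) |
           (static_cast<std::uint32_t>(bytes[1]) << 16) |
           (static_cast<std::uint32_t>(bytes[2]) << 8) |
           static_cast<std::uint32_t>(bytes[3]);
}
```

The choice of most-significant-byte-first order is arbitrary here; what matters is that the order is fixed by the code rather than inherited from the machine.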

V.E:	Allowances are made for future soft real-time processing and multiple machine distribution.

V.F:	The core framework consumes minimal resources. When not involved in handling requests for services, the roster, plug-in loader, roster client and event loopers consume no more than 1% of processor time on a 500 MHz Macintosh G4 system running OS X. Ideally this will be 0% usage. The roster resources use no more than 1 KB of memory per entry. Each registered node, dormant plug-in node, input, output, format, connection, buffer group description, etc. is considered a resource here. Buffer groups themselves involve at least as much memory as the buffers they contain require.
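
One way the 1 KB per-entry limit in V.F could be checked at compile time is sketched below, assuming C++11 or later. The `RosterEntry` layout, its field sizes and the names `kMaxEntryBytes` and `RosterBudgetBytes` are hypothetical; the sketch also assumes entries hold fixed-size fields only, since `sizeof` does not account for heap allocations.

```cpp
#include <cstddef>
#include <cstdint>

// The 1 KB per-entry budget stated in V.F.
constexpr std::size_t kMaxEntryBytes = 1024;

// Hypothetical roster record with fixed-size fields (no heap allocations),
// so sizeof reflects its whole memory footprint.
struct RosterEntry {
    std::uint32_t id;
    std::uint32_t kind;      // e.g., node, input, output, format, connection
    char name[64];
    char description[256];
};

// Fails the build if the record grows past the budget.
static_assert(sizeof(RosterEntry) <= kMaxEntryBytes,
              "roster entry exceeds the 1 KB budget");

// Upper bound on roster memory for a given number of entries.
constexpr std::size_t RosterBudgetBytes(std::size_t entries) {
    return entries * kMaxEntryBytes;
}
```

Under this budget, a roster tracking 100 resources would be bounded by 100 × 1024 = 102,400 bytes, excluding the buffers themselves.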

V.G: