Representations for Multi-modal Human-Computer Interaction

AAAI '98 Workshop, July 26-27, 1998, Madison, Wisconsin
Representations for processing human communication have mainly been concerned with single modalities. Further advances, however, may require taking advantage of the fact that most human communication takes place in more than one modality at the same time.

A core problem in multi-modal human-computer interaction is how the information conveyed via multiple modalities is funneled into and out of a single underlying representation of meaning to be communicated. On the output side, this is the information-to-media allocation problem; on the input side, this is the cross-media information fusion problem.

The aims of this workshop are:

to assess the state of computer representations for understanding human communication in multiple modalities or for communicating with humans through multiple media, and
to encourage collaborative research in developing and using representations that facilitate multi-modal interaction.

Relevant modalities include visual, auditory, olfactory, haptic (touch), kinesthetic (motion/position-sensing), speech, gesture, facial expression, myoelectric signals, and neural inputs. Relevant media include video, text, handwriting, graphics, images, and animation. Proper communication with these modalities and media may be contingent on an underlying set of intentions, such as being informative, deceptive, persuasive, entertaining, affective, social, and so forth.

Topics of interest include (but are not limited to):

Recognizing the multidisciplinary nature of multi-modal communication, the intent of the workshop is to be as inclusive as possible. However, papers should address the topics of the workshop directly. For example, a submission could address these topics by showing in detail how the authors' system processes some multi-modal interaction. Such a description might cover the background information used, how that information is represented and utilized, how the system processes the knowledge to respond appropriately, and how the information is turned into (multi-modal) answers. Reports on representations used in projects whose purpose is to simulate human multi-modal interaction, or to provide multi-modal interfaces to databases or planners, are also appropriate. Position papers are also solicited; for example, such a paper might present an analysis of the types of communicative intent, how they can be classified and sub-classified, and how they are best represented for use in multi-modal systems.

Workshop Format

The workshop format will be a mix of paper presentations, invited talks, panels, and break-out sessions. Paper sessions will be organized around the workshop topics listed above. Panel discussions or break-out sessions will follow paper sessions. Short stand-alone (e.g., laptop-based) demonstrations are welcome. Potential participants whose demonstrations require more significant infrastructure or time are encouraged to submit a demonstration proposal to the Intelligent Systems Demonstrations program at AAAI '98.

Gary W. Strong, Program Director of Interactive Systems at the National Science Foundation, has agreed to give an invited talk.


Attendance is expected to be limited to 50-60 participants. Preference will be given to authors whose papers have been accepted for the workshop.

