CONTENTS
Editorial = xvii
1 Representation and annotation of dialogue = 1
1.1 Introduction = 1
1.1.1 Goals = 1
1.1.2 What is meant by 'Integrated Resources'? = 2
1.1.3 Limitations = 3
1.2 A preliminary classification of dialogue corpora = 5
1.2.1 Dialogue acts = 6
1.2.2 Towards a dialogue typology = 6
1.3 General coding issues = 11
1.4 Orthography = 12
1.4.1 Orthographic representation = 12
1.4.2 Recommendations = 24
1.5 Morphosyntax = 26
1.5.1 Morphosyntactic (POS) annotation = 26
1.5.2 Recommendations = 32
1.6 Syntax = 32
1.6.1 Syntactic annotation = 32
1.6.2 Recommendations = 39
1.7 Prosody = 39
1.7.1 Prosodic annotation = 39
1.7.2 Recommendations = 53
1.8 Pragmatics = 54
1.8.1 Pragmatic annotation: functional dialogue annotation = 54
1.8.2 Recommendations = 66
Appendix A: TEI paralinguistic features = 67
Appendix B: TEI P3 DTD: base tag set for transcribed speech = 68
Appendix C: A few relevant web links = 70
Appendix D: Specimen Annotated Dialogue = 70
D.1: Orthographic Transcription = 71
D.2: Morphosyntactic annotation = 72
D.3: Syntactic annotation = 73
D.4: Prosodic Annotation = 75
D.5: Pragmatic (Dialogue Act) Annotation = 84
D.6: Combined Multi-level Annotation = 87
Appendix E: Morphosyntactic annotation of corpora = 89
Appendix E1: English tagset = 89
Appendix E2: Italian DMI codes = 95
2 Audio-visual and multimodal speech-based systems = 102
2.1 Introduction = 102
2.1.1 Terminology = 103
2.1.2 Chapter outline = 106
2.1.3 Benefits of multimodal systems = 106
2.1.4 Input modalities associated with speech = 109
2.1.5 Output modalities associated with speech = 112
2.1.6 Taxonomies of multimodal applications = 114
2.2 Survey of multimodal systems = 118
2.3 Evaluation of multimodal systems = 122
2.3.1 Types of evaluation = 123
2.3.2 Evaluation methodologies = 124
2.3.3 Specific evaluation issues = 127
2.3.4 Recommendations = 129
2.4 Speech input with facial information (audio-visual speech recognition) = 129
2.4.1 Face recognition = 129
2.4.2 Locating and tracking other facial features = 130
2.4.3 Automatic lipreading systems = 131
2.4.4 Integration of audio and visual signals = 131
2.5 Speech output with talking heads = 132
2.5.1 Control techniques = 132
2.5.2 Lip shape computation = 137
2.5.3 Talking heads: audio and video output synchronisation = 138
2.6 Speech input with modalities other than faces = 138
2.6.1 Recognition of non-speech input modalities = 139
2.6.2 Integration in multimodal applications = 140
2.7 Speech output in multimedia systems = 145
2.7.1 Taxonomy of output modalities = 146
2.7.2 Output devices = 146
2.7.3 Theoretical issues = 147
2.7.4 Summary of recommendations = 155
2.8 Technology of multimodal system components = 157
2.8.1 Techniques related to face recognition systems = 157
2.8.2 Synthesis module = 163
2.8.3 Facial models = 164
2.8.4 Building conversational agents = 173
2.8.5 On-line character and handwriting recognition = 178
2.8.6 Gesture recognition = 183
2.8.7 Technical issues = 190
2.9 Standards and resources for multimodal/multimedia systems = 190
2.9.1 Standards and resources for monomodal processing = 190
2.9.2 Towards standards for multimedia systems = 191
2.9.3 Towards standards for hypermedia systems = 193
2.9.4 Architectures and toolkits for multimodal integration = 193
2.9.5 Notational systems = 195
2.9.6 Face and audio databases = 196
3 Consumer off-the-shelf (COTS) product and service evaluation = 204
3.1 Introduction = 204
3.1.1 Purpose and scope of this chapter = 204
3.1.2 Introduction to speech technologies and classification = 204
3.1.3 Automatic speech recognition = 205
3.1.4 Text-to-speech and speech synthesis = 206
3.1.5 Speaker recognition and verification = 208
3.1.6 Speech understanding = 208
3.1.7 Dialogue control = 209
3.2 General remarks = 209
3.2.1 Assessment methodology = 209
3.2.2 Subjective assessment measures = 213
3.2.3 Acoustic environment = 214
3.2.4 Comparing several systems = 216
3.3 Command and control systems = 216
3.3.1 Typical systems = 216
3.3.2 Typical issues = 218
3.3.3 Evaluation design = 220
3.3.4 Examples = 222
3.4 Document generation = 227
3.4.1 Typical systems = 227
3.4.2 Typical issues = 228
3.4.3 Evaluation design = 229
3.4.4 Examples = 229
3.5 Services and telephone applications = 233
3.5.1 Typical systems = 233
3.5.2 Typical issues = 234
3.5.3 Evaluation design = 234
3.5.4 Examples = 235
3.6 Conclusion and summary of recommendations = 238
4 Terminology for spoken language systems = 240
4.1 Introduction = 240
4.1.1 Terminology standards = 240
4.1.2 Termbank users = 242
4.1.3 Chapter outline = 243
4.2 Terminological basics = 243
4.2.1 Central notions in terminological theory = 243
4.2.2 Relations between terms = 247
4.3 The organisation of terminology = 249
4.3.1 The onomasiological and semasiological perspectives = 249
4.3.2 Terminological macrostructures and microstructures = 251
4.4 Spoken Language terminology = 252
4.4.1 The hybrid character of SL terminology = 252
4.4.2 Toward a microstructure for SL terminology = 253
4.4.3 Recommendations on termbank development = 259
4.4.4 Recommendations for further reading = 260
4.5 Relational databases = 261
4.5.1 Components of a relational database = 261
4.5.2 Structures in the relational model = 261
4.5.3 Codd's definition of a relational database system = 262
4.5.4 Query language = 262
4.5.5 Software implementations = 262
4.5.6 Distribution of data generation over time = 263
4.5.7 Distribution of data generation over resources = 263
4.5.8 Required system components = 264
4.6 Terminology Management Systems (TMSs), databases, and interchange formats = 264
4.6.1 MultiTerm = 264
4.6.2 ITU Telecommunication Terminology Database: TERMITE = 265
4.6.3 TERMIUM - Canadian Linguistic Data Bank = 267
4.6.4 EURODICAUTOM = 268
4.6.5 MARTIF terminology interchange format (ISO 12200) = 269
4.7 The EAGLET Term Database: an SL termbank = 271
4.7.1 A hypergraph-based approach = 271
4.7.2 Conceptual parts = 272
4.7.3 Information storage = 272
4.7.4 System components = 272
4.7.5 Structure = 273
4.7.6 EAGLET macrostructure for SL terminology = 273
4.7.7 EAGLET microstructure for SL terminology = 275
4.7.8 Using the EAGLET Term Database = 277
4.7.9 Future work = 280
5 Reference materials = 281
5.1 Introduction = 281
5.2 Organisations and infrastructure = 282
5.2.1 Speech resources, agencies, and associations = 282
5.2.2 Archives, general information = 291
5.2.3 Education and conferences = 293
5.3 "SLP at Work" = 296
5.3.1 Speech interfaces = 296
5.3.2 Telecommunications and broadcast = 297
5.3.3 New services = 298
5.3.4 SLP as a research tool = 298
5.4 SLP procedures, tools, and formats = 301
5.4.1 Annotation = 302
5.4.2 Validation, evaluation = 303
5.4.3 Tools and standards = 304
5.4.4 Text = 308
5.5 Technology = 309
5.5.1 Alphabets = 310
5.5.2 Networks = 310
5.5.3 File formats = 322
5.5.4 Programming = 324
5.5.5 Storage = 326
Bibliographical references = 329
A SAMPA and X-SAMPA phonetic symbols = 359
B The EAGLET term database = 367
B.1 Introduction = 367
B.2 EAGLET terminal (abridged) = 369
List of abbreviations = 497
Index = 503
CD-ROM disclaimer = 521