Opportunities for massive extensions of linguistic annotation research and applications

D. Terence Langendoen, National Science Foundation

The US National Science Foundation is responding to the growing data deluge, and to the corresponding increases in computing power and global networking, by undertaking new initiatives in such areas as interoperability*, sustainable data archives**, and cyber-enabled discovery and innovation***. These initiatives, together with NSF's standing programs and activities in Robust Intelligence+, Linguistics++, Documenting Endangered Languages+++ and related areas, provide great opportunities both for extending foundational work on rich linguistic annotation frameworks and for developing applications of them, particularly if combined with similar initiatives being developed in Europe, Japan and elsewhere. If fully interoperable, semantically aware annotated data sets can be developed for the languages of the world, computational linguists, in collaboration with other researchers, will be able to address grand challenge questions that we have not been able to tackle before, possibly in time for consideration in the coming decade, which may be identified as the Decade of the Mind^.


* http://www.nsf.gov/pubs/2007/nsf07565/nsf07565.pdf

** http://www.nsf.gov/pubs/2007/nsf07601/nsf07601.pdf

*** http://www.nsf.gov/pubs/2007/nsf07603/nsf07603.pdf

+ http://www.nsf.gov/pubs/2007/nsf07577/nsf07577.pdf

++ http://www.nsf.gov/funding/pgm_summ.jsp?pims_id=5408

+++ http://www.nsf.gov/pubs/2006/nsf06577/nsf06577.pdf

^ http://www.sciencemag.org/cgi/content/full/317/5843/1321b