research

Vocabulary a week: #2 - Friday, October 20

For the next few months, we will be publishing small sets of customized vocabularies based on popular news items. The vocabularies will feature a set of new words which (fingers crossed) have never before existed in the common vocabulary known as the English language.

The vocabulary words and accompanying definitions are computer-generated using a multi-step definition generator. The generator's output will be enhanced and tweaked over time as new theories are tested and new ideas come to light. More information is available from the GTR Dictionary Project.


Vocabulary a week: #1 - Friday, October 13

For the next few months, we will be publishing small sets of customized vocabularies based on popular news items. The vocabularies will feature a set of new words which (fingers crossed) have never before existed in the common vocabulary known as the English language.

The vocabulary words and accompanying definitions are computer-generated using a multi-step definition generator. The generator's output will be enhanced and tweaked over time as new theories are tested and new ideas come to light. More information is available from the GTR Dictionary Project.


A web browser for experimental writers?

With the release of the open source Firefox browser, numerous spin-off projects have been launched that build on the core Firefox browsing functionality. Two of the more interesting offshoots are Flock and Songbird.

Flock advertises itself as a "social browser" because of its built-in support for many social networking services such as Flickr, Del.icio.us, Technorati and others. Songbird describes itself as a mishmash of digital jukebox, web browser and media player, with some interesting features for "playing the web".


Google N-gram data released

Fun times all around! Google has finally released "version 1" of its N-gram word data sets through the Linguistic Data Consortium. For $150 US, you too can own the following on a six-DVD set:

File sizes: approx. 24 GB compressed (gzip'ed) text files

Number of tokens: 1,024,908,267,229
Number of sentences: 95,119,665,584
Number of unigrams: 13,588,391
Number of bigrams: 314,843,401
Number of trigrams: 977,069,902
Number of fourgrams: 1,313,818,354
Number of fivegrams: 1,176,470,663
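
To make the unigram-through-fivegram tallies above concrete, here is a minimal Python sketch that counts n-grams of each order in a toy token list. The function name `count_ngrams` and the sample sentence are illustrative assumptions; this is not the procedure Google used, and it says nothing about the file layout of the released data.

```python
from collections import Counter

def count_ngrams(tokens, max_order=5):
    """Return {n: Counter of n-gram tuples} for n = 1..max_order."""
    counts = {n: Counter() for n in range(1, max_order + 1)}
    for n in range(1, max_order + 1):
        # Slide a window of width n across the token list.
        for i in range(len(tokens) - n + 1):
            counts[n][tuple(tokens[i:i + n])] += 1
    return counts

# Toy input (hypothetical); a real corpus would be tokenized text.
counts = count_ngrams("to be or not to be".split())
for order in sorted(counts):
    print(order, "distinct:", len(counts[order]))
```

On a real corpus, the number of distinct n-grams at each order is the kind of figure reported in the list above.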

 


Google Research releasing N-gram data

Google Research announced today that it will be releasing its N-gram data to the public. N-gram models are a type of statistical model used to predict the likelihood of the next item in a sequence; in this case, the items are words. N-gram models are used in a number of computational linguistics tasks, such as translation, part-of-speech tagging and word sense disambiguation.
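
As a rough illustration of next-word prediction, the following Python sketch builds a bigram (2-gram) model from a toy sentence and estimates the probability of a word given the word before it. It uses a plain maximum-likelihood estimate with no smoothing; the function names and the sample text are assumptions made for this example and are not taken from Google's release.

```python
from collections import Counter, defaultdict

def train_bigram_model(tokens):
    """Count how often each word follows each preceding word."""
    follow_counts = defaultdict(Counter)
    for prev, nxt in zip(tokens, tokens[1:]):
        follow_counts[prev][nxt] += 1
    return follow_counts

def next_word_probability(model, prev, nxt):
    """Maximum-likelihood estimate of P(nxt | prev); 0.0 if prev is unseen."""
    followers = model.get(prev, Counter())
    total = sum(followers.values())
    return followers[nxt] / total if total else 0.0

# Toy corpus (hypothetical), split on whitespace.
tokens = "the cat sat on the mat and the cat slept".split()
model = train_bigram_model(tokens)
print(next_word_probability(model, "the", "cat"))
```

In this toy text "the" is followed by "cat" twice and "mat" once, so the estimate for "cat" after "the" is two in three. Practical models add smoothing so that unseen word pairs do not get a probability of zero.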


Why ALG is hard: interdisciplinarity

Over at his blog, Jim Carpenter has a series on why aesthetic language generation (ALG) is difficult. There are plenty of discussions on the nature of working with aesthetic text generation systems from a writer/reader perspective, but very little on the challenges and questions raised in the actual construction of these systems (outside of the purely technical discussions within fields like Computational Linguistics). Jim has constructed a large-scale electronic text composition system entitled "Erica T. Carter" and knows firsthand the issues involved. Anyone interested in this area should keep an eye on his posts.


Publications

The purpose of this publications library is to archive publications, or links to websites hosting publications, related to the intersection of computer technology and writing.

This includes publications from various fields of language-related computing: Computational Linguistics (CL), Natural Language Processing (NLP), Humanities Computing, Computer Poetry, etc. GTR claims no rights to or ownership of these works; copyright is held by the respective authors.

analysis
  • Multilingual Algorithms for Literary Text Analysis (MALTA) (1989?)
    Authors: Paul Braffort, Josiane Joncquel (members of the ALAMO group)
    Visiting scholars, University of Chicago
    Visit website
  • Literalgorithmics: Wanderings in Linguistic and Literary Text Analysis (1989)
    Authors: Paul Braffort (ALAMO)
    internal ALAMO report
    Visit website
  • What is text analysis, really? (2003)
    Authors: Geoffrey Rockwell
    Literary and Linguistic Computing, Vol. 18, No. 2, 2003, p. 209-219.
    download [252.43 KB]
  • Segmenting Natural Language by Articulatory Features (1967)
    Authors: David Shillan
    Cambridge Language Research Unit
    download [457.11 KB]
  • About TextTiling
    Authors: Marti Hearst
    Visit website
  • Seeing the Text Through the Trees: Visualization and Interactivity in the Text Applications (1999)
    Authors: Dr. Geoffrey Rockwell, John Bradley and Patricia Monger
    Literary and Linguistic Computing, vol. 14, no. 1, 1999, p. 115-130.
    download [295.14 KB]
  • Discrimination of Authorship using Visualization (1994)
    Authors: Bradley Kjell, W. Addison Woods, Ophir Frieder
    Information Processing and Management, Volume 30, Number 1, 1994
    download [346.43 KB]
computer poetry
  • A Flexible Integrated Architecture For Generating Poetic Texts (2000)
    Authors: Hisar Manurung, Graeme Ritchie, Henry Thompson
    In Proceedings of the Fourth Symposium on Natural Language Processing (SNLP 2000), Chiang Mai, Thailand 10-12 May 2000
    download [182.19 KB]
  • Towards A Computational Model of Poetry Generation (2000)
    Authors: Hisar Maruli Manurung, Graeme Ritchie, Henry Thompson
    Informatics Research Report EDI-INF-RR-0015
    download [72.24 KB]
  • Linguistic Creativity at Different Levels of Decision in Sentence Production (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [161.53 KB]
  • Modeling Literary Style for Semi-Automatic Generation of Poetry (2001)
    Authors: Pablo Gervás
    Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid
    download [21.73 KB]
  • Exploring Quantitative Evaluations of the Creativity of Automatic Poets (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [66.55 KB]
  • Principles and Processes of Generative Literature: Questions to Literature (2005)
    Authors: Jean-Pierre Balpe
    dichtung-digital - journal für digitale ästhetik
    Visit website
  • Literalgorithmics: Wanderings in Linguistic and Literary Text Analysis (1989)
    Authors: Paul Braffort (ALAMO)
    internal ALAMO report
    Visit website
  • French e-poetry: A short/long story (2002)
    Authors: Patrick-Henri Burgaud
    dichtung-digital - journal für digitale ästhetik
    Visit website
  • Producing Computer Poetry (1977)
    Authors: Margaret Chisman
    The Best of Creative Computing Volume 2
    Visit website
  • Writers and Computers: An Interview With Carol Spearin McCauley (1977)
    Authors: Cathy Silverstein
    The Best of Creative Computing Volume 2
    Visit website
  • Haiku Generator (1977)
    Authors: Paul J. Emmerich
    The Best of Creative Computing Volume 2
    Visit website
  • COMPUTER POETRY (1992)
    Authors: M. Vincent van Mechelen
    Visit website
  • Gnoetry: interview with Eric Elshtain (2006)
    Authors: Christy Dena
    WRT: Writer Response Theory
    Visit website
  • The WASP Poetry Generation System (2005)
    Authors: Pablo Gervás
    Visit website
  • Reading Processes: Hartman’s Virtual Muse (2005)
    Authors: Noah Wardrip-Fruin
    http://grandtextauto.gatech.edu
    Visit website
  • TEAnO, an Organization for the Application of Computers to Art Production (1996)
    Authors: P. Ferrara - Etnoteam spa, TEAnO
    TEAnO
    download [179.55 KB]
    Visit website
  • Digital Poetry Timeline (2003)
    Authors: Christopher T. Funkhouser
    http://web.njit.edu/~funkhous
    Visit website
  • Poetry Digital Media and Cybertext (2004)
    Authors: Christopher T. Funkhouser
    http://web.njit.edu/~funkhous
    download [1.12 MB]
  • An Evolutionary Algorithm Approach to Poetry Generation (2003)
    Authors: Hisar Maruli Manurung
    Institute for Communicating and Collaborative Systems, School of Informatics, University of Edinburgh
    download [2.11 MB]
  • TCR September Launch Party
    download [162.99 KB]
frameworks
  • A Flexible Integrated Architecture For Generating Poetic Texts (2000)
    Authors: Hisar Manurung, Graeme Ritchie, Henry Thompson
    In Proceedings of the Fourth Symposium on Natural Language Processing (SNLP 2000), Chiang Mai, Thailand 10-12 May 2000
    download [182.19 KB]
  • Building Applied Natural Language Generation Systems (1997)
    Authors: Ehud Reiter and Robert Dale
    Journal of Natural Language Engineering
    download [382.44 KB]
  • TIPSTER Text Phase III : TIPSTER Text Architecture Design - (Version 3.1 7) (1998)
    Authors: Ralph Grishman and the TIPSTER Phase III Contractors.
    download [263.42 KB]
  • TIPSTER Text Phase III : Configuration Management Plan (Version 1.3, 25) (1997)
    Authors: Architecture Committee for the TIPSTER Text Phase III Program
    download [124.14 KB]
  • TIPSTER Text Phase II : Architecture Requirements - (Version 2.0.1 27) (1996)
    Authors: Architecture Committee for the TIPSTER Text Phase II Program
    download [173.15 KB]
  • TIPSTER Text Phase II : Architecture Concept - (Version 1.1.2 27) (1996)
    Authors: Architecture Committee for the TIPSTER Text Phase II Program
    download [143.69 KB]
  • Modeling Literary Style for Semi-Automatic Generation of Poetry (2001)
    Authors: Pablo Gervás
    Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid
    download [21.73 KB]
  • Multilingual Algorithms for Literary Text Analysis (MALTA) (1989?)
    Authors: Paul Braffort, Josiane Joncquel (members of the ALAMO group)
    Visiting scholars, University of Chicago
    Visit website
  • Literalgorithmics: Wanderings in Linguistic and Literary Text Analysis (1989)
    Authors: Paul Braffort (ALAMO)
    internal ALAMO report
    Visit website
  • Gnoetry: interview with Eric Elshtain (2006)
    Authors: Christy Dena
    WRT: Writer Response Theory
    Visit website
  • The WASP Poetry Generation System (2005)
    Authors: Pablo Gervás
    Visit website
  • Seeing the Text Through the Trees: Visualization and Interactivity in the Text Applications (1999)
    Authors: Dr. Geoffrey Rockwell, John Bradley and Patricia Monger
    Literary and Linguistic Computing, vol. 14, no. 1, 1999, p. 115-130.
    download [295.14 KB]
  • FROGS: A Framework for Developing Natural Language Generation Software (2005)
    Authors: Pablo Gervás, Raquel Hervás Ballesteros, Carlos García Ibáñez and Miguel Ancochea Nodal
    Visit website
history
  • Natural Language Processing: a historical review (2001)
    Authors: Karen Sparck Jones
    Computer Laboratory, University of Cambridge
    download [402.71 KB]
  • Machine Translation at the Cambridge Language Research Unit 1956-1967 (1986)
    Authors: John Hutchins
    Machine translation: past, present, future
    download [175.95 KB]
    Visit website
  • Digital Poetry Timeline (2003)
    Authors: Christopher T. Funkhouser
    http://web.njit.edu/~funkhous
    Visit website
  • Poetry Digital Media and Cybertext (2004)
    Authors: Christopher T. Funkhouser
    http://web.njit.edu/~funkhous
    download [1.12 MB]
  • Interactive Reading, Early Modern Texts and Hypertext: A Lesson from the Past
    Authors: Tatjana Chorney
    http://www.academiccommons.org
    Visit website
literary
  • TCR September Launch Party
    download [162.99 KB]
machine translation
  • Machine Translation at the Cambridge Language Research Unit 1956-1967 (1986)
    Authors: John Hutchins
    Machine translation: past, present, future
    download [175.95 KB]
    Visit website
modelling
  • Machine humour: An implemented model of puns (1996)
    Authors: Kim Binsted
    Ph.D. University of Edinburgh
    download [973.28 KB]
  • Linguistic Creativity at Different Levels of Decision in Sentence Production (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [161.53 KB]
  • Exploring Quantitative Evaluations of the Creativity of Automatic Poets (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [66.55 KB]
  • Computer Understanding of Conventional Metaphoric Language (1992)
    Authors: James H. Martin
    University of California, Berkeley
    download [349.28 KB]
  • Modelling: a study in words and meanings (2003)
    Authors: Willard McCarty
    King’s College London
    download [293.82 KB]
  • Computational Modelling of Linguistic Humour: Tom Swifties (1992)
    Authors: Greg Lessard and Michael Levison
    ALLC/ACH Conference, 1992, Oxford
    Visit website
  • A Mathematical Theory of Communication (1948)
    Authors: Claude E. Shannon
    Bell System Technical Journal
    download [357.71 KB]
  • Context as Spurious Concept (1997)
    Authors: Graeme Hirst
    AAAI Fall Symposium on Context in Knowledge Representation and Natural Language, Cambridge, Massachusetts, 1997
    download [70.82 KB]
  • Semantic Networks
    Authors: John F. Sowa
    Visit website
morphology
  • An Algorithmic Approach to English Pluralization
    Authors: Damian Conway
    School of Computer Science and Software Engineering, Monash University, Australia
    Visit website
  • Morphological Parsing with a Unification-based Word Grammar (PC-KIMMO) (1994)
    Authors: Evan L. Antworth
    North Texas Natural Language Processing Workshop
    Visit website
  • A Simpler, Intuitive Approach to Morpheme Induction (2005)
    Authors: Samarth Keshava and Emily Pitler
    MorphoChallenge 2005, http://www.cis.hut.fi/morphochallenge2005/
    download [105.45 KB]
  • Computational Morphology (A handbook)
    Authors: Harald Trost
    Das Institut für Medizinische Kybernetik und Artificial Intelligence (IMKAI)
    Visit website
  • Morphological Analysis (1973)
    Authors: Martin Kay
    Computational And Mathematical Linguistics: Proceedings of the International Conference on Computational Linguistics
    download [854.16 KB]
narrative
  • Narrative Prose Generation (2001)
    Authors: Charles Callaway and James Lester
    In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, Seattle, WA
    download [542.19 KB]
  • Narratology: A Guide to the Theory of Narrative (2005)
    Authors: Manfred Jahn
    English Department, University of Cologne
    download [496.33 KB]
    Visit website
NLG (Natural Language Generation)
  • A Flexible Integrated Architecture For Generating Poetic Texts (2000)
    Authors: Hisar Manurung, Graeme Ritchie, Henry Thompson
    In Proceedings of the Fourth Symposium on Natural Language Processing (SNLP 2000), Chiang Mai, Thailand 10-12 May 2000
    download [182.19 KB]
  • Towards A Computational Model of Poetry Generation (2000)
    Authors: Hisar Maruli Manurung, Graeme Ritchie, Henry Thompson
    Informatics Research Report EDI-INF-RR-0015
    download [72.24 KB]
  • Narrative Prose Generation (2001)
    Authors: Charles Callaway and James Lester
    In Proceedings of the Seventeenth International Joint Conference on Artificial Intelligence, Seattle, WA
    download [542.19 KB]
  • Building Applied Natural Language Generation Systems (1997)
    Authors: Ehud Reiter and Robert Dale
    Journal of Natural Language Engineering
    download [382.44 KB]
  • The Practical Value of n-grams in generation (1998)
    Authors: Langkilde, I. and Knight, K.
    In Proceedings of the 9th International Natural Language Generation Workshop (INLG), Niagara-on-the-Lake, Ontario.
    download [166.52 KB]
  • Linguistic Creativity at Different Levels of Decision in Sentence Production (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [161.53 KB]
  • Modeling Literary Style for Semi-Automatic Generation of Poetry (2001)
    Authors: Pablo Gervás
    Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid
    download [21.73 KB]
  • Exploring Quantitative Evaluations of the Creativity of Automatic Poets (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [66.55 KB]
  • Principles and Processes of Generative Literature: Questions to Literature (2005)
    Authors: Jean-Pierre Balpe
    dichtung-digital - journal für digitale ästhetik
    Visit website
  • Literalgorithmics: Wanderings in Linguistic and Literary Text Analysis (1989)
    Authors: Paul Braffort (ALAMO)
    internal ALAMO report
    Visit website
  • The WASP Poetry Generation System (2005)
    Authors: Pablo Gervás
    Visit website
  • Reading Processes: Hartman’s Virtual Muse (2005)
    Authors: Noah Wardrip-Fruin
    http://grandtextauto.gatech.edu
    Visit website
  • FROGS: A Framework for Developing Natural Language Generation Software (2005)
    Authors: Pablo Gervás, Raquel Hervás Ballesteros, Carlos García Ibáñez and Miguel Ancochea Nodal
    Visit website
  • TEAnO, an Organization for the Application of Computers to Art Production (1996)
    Authors: P. Ferrara - Etnoteam spa, TEAnO
    TEAnO
    download [179.55 KB]
    Visit website
  • An Evolutionary Algorithm Approach to Poetry Generation (2003)
    Authors: Hisar Maruli Manurung
    Institute for Communicating and Collaborative Systems, School of Informatics, University of Edinburgh
    download [2.11 MB]
NLP (Natural Language Processing)
  • Natural Language Processing: a historical review (2001)
    Authors: Karen Sparck Jones
    Computer Laboratory, University of Cambridge
    download [402.71 KB]
  • Language Processing and the Thesaurus (1997)
    Authors: Yorick Wilks
    http://www.dcs.shef.ac.uk/~yorick
    download [48.85 KB]
parsing
  • A Perl program for sentence splitting using rules (2001)
    Authors: Paul Clough
    University of Sheffield
    download [614.24 KB]
  • What is a word, What is a sentence? Problems of Tokenization (1994)
    Authors: Gregory Grefenstette, Pasi Tapanainen
    Rank Xerox Research Centre Grenoble Laboratory
    download [205.95 KB]
readings
  • TCR September Launch Party
    download [162.99 KB]
segmentation
  • TextTiling: A Quantitative Approach to Discourse Segmentation (1993)
    Authors: Marti A. Hearst
    Technical Report UCB:S2K-93-24
    download [80.98 KB]
  • A Perl program for sentence splitting using rules (2001)
    Authors: Paul Clough
    University of Sheffield
    download [614.24 KB]
  • What is a word, What is a sentence? Problems of Tokenization (1994)
    Authors: Gregory Grefenstette, Pasi Tapanainen
    Rank Xerox Research Centre Grenoble Laboratory
    download [205.95 KB]
  • Segmenting Natural Language by Articulatory Features (1967)
    Authors: David Shillan
    Cambridge Language Research Unit
    download [457.11 KB]
  • About TextTiling
    Authors: Marti Hearst
    Visit website
semantics
  • Semantic Networks
    Authors: John F. Sowa
    Visit website
  • Language Processing and the Thesaurus (1997)
    Authors: Yorick Wilks
    http://www.dcs.shef.ac.uk/~yorick
    download [48.85 KB]
stylistics
  • Analysing Style - Readability (2000)
    Authors: Paul Clough
    METER Corpus Experiments
    download [85.47 KB]
  • Modeling Literary Style for Semi-Automatic Generation of Poetry (2001)
    Authors: Pablo Gervás
    Departamento de Sistemas Informáticos y Programación, Universidad Complutense de Madrid
    download [21.73 KB]
  • Exploring Quantitative Evaluations of the Creativity of Automatic Poets (2002)
    Authors: Pablo Gervás
    Universidad Complutense de Madrid
    download [66.55 KB]
  • Discrimination of Authorship using Visualization (1994)
    Authors: Bradley Kjell, W. Addison Woods, Ophir Frieder
    Information Processing and Management, Volume 30, Number 1, 1994
    download [346.43 KB]
  • Minnesota Contextual Content Analysis system (2001)
    Authors: Ken Litkowski and Don McTavish
    Visit website
  • Style vs. Expression in Literary Narratives (2005)
    Authors: Uzuner, Ö., Katz, B.
    Proceedings of the Twenty-eighth Annual International ACM SIGIR Conference (SIGIR 2005)
    Visit website
summarization
  • Statistics-Based Summarization Step One: Sentence Compression (2000)
    Authors: Kevin Knight, Daniel Marcu
    AAAI/IAAI.
    download [200.19 KB]
  • Producing Intelligent Telegraphic Text Reduction to Provide an Audio Scanning Service for the Blind (1998)
    Authors: Gregory Grefenstette
    AAAI Spring Symposium
    download [145.29 KB]
  • TextTiling: A Quantitative Approach to Discourse Segmentation (1993)
    Authors: Marti A. Hearst
    Technical Report UCB:S2K-93-24
    download [80.98 KB]
tagging
  • A Simple Rule-Based Part Of Speech Tagger (1992)
    Authors: Eric Brill
    Proceedings of ANLP-92, 3rd Conference on Applied Natural Language Processing
    download [152.21 KB]
  • Transformation-Based Error-Driven Learning and Natural Language Processing: A Case Study in Part of Speech Tagging (1995)
    Authors: Eric Brill
    Computational Linguistics
    download [274 KB]
  • Deterministic Part-of-Speech Tagging with Finite State Transducers (1995)
    Authors: Emmanuel Roche, Yves Schabes
    Computational Linguistics
    download [380.29 KB]
  • Building a large annotated corpus of English: the Penn Treebank (1993)
    Authors: Mitchell P. Marcus, Beatrice Santorini, Mary Ann Marcinkiewicz
    University of Pennsylvania
    Visit website
theory
  • Principles and Processes of Generative Literature: Questions to Literature (2005)
    Authors: Jean-Pierre Balpe
    dichtung-digital - journal für digitale ästhetik
    Visit website
  • Multilingual Algorithms for Literary Text Analysis (MALTA) (1989?)
    Authors: Paul Braffort, Josiane Joncquel (members of the ALAMO group)
    Visiting scholars, University of Chicago
    Visit website
  • Literalgorithmics: Wanderings in Linguistic and Literary Text Analysis (1989)
    Authors: Paul Braffort (ALAMO)
    internal ALAMO report
    Visit website
  • French e-poetry: A short/long story (2002)
    Authors: Patrick-Henri Burgaud
    dichtung-digital - journal für digitale ästhetik
    Visit website
  • COMPUTER POETRY (1992)
    Authors: M. Vincent van Mechelen
    Visit website
  • Modelling: a study in words and meanings (2003)
    Authors: Willard McCarty
    King’s College London
    download [293.82 KB]
  • What is text analysis, really? (2003)
    Authors: Geoffrey Rockwell
    Literary and Linguistic Computing, Vol. 18, No. 2, 2003, p. 209-219.
    download [252.43 KB]
  • A Mathematical Theory of Communication (1948)
    Authors: Claude E. Shannon
    Bell System Technical Journal
    download [357.71 KB]
  • Poetry Digital Media and Cybertext (2004)
    Authors: Christopher T. Funkhouser
    http://web.njit.edu/~funkhous
    download [1.12 MB]
  • Context as Spurious Concept (1997)
    Authors: Graeme Hirst
    AAAI Fall Symposium on Context in Knowledge Representation and Natural Language, Cambridge, Massachusetts, 1997
    download [70.82 KB]
  • Interactive Reading, Early Modern Texts and Hypertext: A Lesson from the Past
    Authors: Tatjana Chorney
    http://www.academiccommons.org
    Visit website
  • Narratology: A Guide to the Theory of Narrative (2005)
    Authors: Manfred Jahn
    English Department, University of Cologne
    download [496.33 KB]
    Visit website
visualization
  • Seeing the Text Through the Trees: Visualization and Interactivity in the Text Applications (1999)
    Authors: Dr. Geoffrey Rockwell, John Bradley and Patricia Monger
    Literary and Linguistic Computing, vol. 14, no. 1, 1999, p. 115-130.
    download [295.14 KB]
