A System Description of P^4: Possible Punctuation Points Parser

Authors

  • Thomas Boehnlein
  • Jennifer Seitzer
Abstract

We present a Natural Language Understanding (NLU) implementation that automatically inserts punctuation marks into a sequence of words to create a group of one or more syntactically correct sentences. The software, Possible Punctuation Points Parser (P^4), lets the user input a string of words, computes the punctuation possibilities, and then provides several visualizations that illustrate how it arrived at its final solution. P^4 combines a chart parsing algorithm with a search algorithm that creates data visualization structures. A potential application of this software is as a starting point for automatic punctuation mark insertion during the voice-to-text conversion found on many mobile platforms.
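
The abstract does not give implementation details, so the following is only a minimal sketch of the general idea (a chart parser that checks candidate sentences, wrapped in a search over possible sentence boundaries). It is not the authors' P^4 code. The toy lexicon, the grammar rules, and the function names `is_sentence` and `punctuate` are all assumptions made for illustration, and only sentence-final periods are inserted.

```python
# Toy lexicon and grammar (hypothetical, for illustration only).
# Intransitive verbs are tagged directly as VP so the grammar stays binary.
LEXICON = {
    "the": {"Det"},
    "dog": {"N"},
    "cat": {"N"},
    "barks": {"VP"},
    "sleeps": {"VP"},
    "sees": {"V"},
}
RULES = {
    ("NP", "VP"): "S",
    ("Det", "N"): "NP",
    ("V", "NP"): "VP",
}


def is_sentence(words):
    """CYK-style chart parse: True if the words derive the symbol S."""
    n = len(words)
    if n == 0:
        return False
    # chart[i][k] holds every category that spans words[i:k].
    chart = [[set() for _ in range(n + 1)] for _ in range(n + 1)]
    for i, word in enumerate(words):
        chart[i][i + 1] = set(LEXICON.get(word, ()))
    for span in range(2, n + 1):
        for i in range(0, n - span + 1):
            k = i + span
            for j in range(i + 1, k):
                for left in chart[i][j]:
                    for right in chart[j][k]:
                        parent = RULES.get((left, right))
                        if parent:
                            chart[i][k].add(parent)
    return "S" in chart[0][n]


def punctuate(words):
    """Depth-first search over boundary points; returns every punctuation
    of the word sequence in which each segment parses as a sentence."""
    def search(start):
        if start == len(words):
            return [[]]
        results = []
        for end in range(start + 1, len(words) + 1):
            segment = words[start:end]
            if is_sentence(segment):
                for rest in search(end):
                    results.append([segment] + rest)
        return results

    return [
        " ".join(" ".join(seg).capitalize() + "." for seg in segmentation)
        for segmentation in search(0)
    ]


print(punctuate("the dog barks the cat sees the dog".split()))
```

With this toy grammar, the only segmentation in which every segment parses is "The dog barks. The cat sees the dog.", so the search returns a single candidate. A realistic system would need a far larger grammar and some way to rank competing candidates.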

Similar resources

Assessment of safety in drinking water supply system of Birjand city using World Health Organization’s water safety plan

Background: The conventional method for managing drinking water quality is not a suitable preventive strategy for protecting public health. A water safety plan (WSP) presents a systematic approach to ensuring the health and quality of drinking water. This study assessed the drinking water supply system safety of Birjand city using the WHO’s WSP. Methods: This investigation employed the WSP-QA ...

Punctuation Processing for Projective Dependency Parsing

Modern statistical dependency parsers assign lexical heads to punctuations as well as words. Punctuation parsing errors lead to low parsing accuracy on words. In this work, we propose an alternative approach to addressing punctuation in dependency parsing. Rather than assigning lexical heads to punctuations, we treat punctuations as properties of their neighbouring words, used as features to gu...

Developing and Evaluating a Probabilistic LR Parser of Part-of-Speech and Punctuation Labels

We describe an approach to robust domain-independent syntactic parsing of unrestricted naturally-occurring (English) input. The technique involves parsing sequences of part-of-speech and punctuation labels using a unification-based grammar coupled with a probabilistic LR parser. We describe the coverage of several corpora using this grammar and report the results of a parsing experiment using pr...

Automata-guided Context-free parsing for punctuationless languages

We propose a system for analyzing texts written in languages which don't make use of punctuation, with syntactic tagging in mind. The core system is a simple chart parser, but to cope with the complexity and ambiguity problems, we use simplified finite-state automata, which guide the analysis. An application to Ancient Egyptian texts is introduced.

Sentence-Internal Prosody Does not Help Parsing the Way Punctuation Does

This paper investigates the usefulness of sentence-internal prosodic cues in syntactic parsing of transcribed speech. Intuitively, prosodic cues would seem to provide much the same information in speech as punctuation does in text, so we tried to incorporate them into our parser in much the same way as punctuation is. We compared the accuracy of a statistical parser on the LDC Switchboard treeb...

Publication date: 2013