[1811.01824] Structured Neural Summarization
Summarization of long sequences into a concise statement is a core problem in natural language processing, requiring non-trivial understanding of the input. Based on the promising results of graph neural networks on highly structured data, we develop a framework to extend existing sequence encoders with a graph component that can reason about long-distance relationships in weakly structured data such as text. In an extensive evaluation, we show that the resulting hybrid sequence-graph models outperform both pure sequence models and pure graph models on a range of summarization tasks.
Unsupervised Text Summarization using Sentence Embeddings
Overview of an approach used to perform Text Summarization in Python
Get To The Point: Summarization with Pointer-Generator Networks
Neural sequence-to-sequence models have provided a viable new approach for abstractive text summarization (meaning they are not restricted to simply selecting and rearranging passages from the original text). However, these models have two shortcomings: they are liable to reproduce factual details inaccurately, and they tend to repeat themselves. In this work we propose a novel architecture that augments the standard sequence-to-sequence attentional model in two orthogonal ways. First, we use a hybrid pointer-generator network that can copy words from the source text via pointing, which aids accurate reproduction of information, while retaining the ability to produce novel words through the generator. Second, we use coverage to keep track of what has been summarized, which discourages repetition. We apply our model to the CNN / Daily Mail summarization task, outperforming the current abstractive state-of-the-art by at least 2 ROUGE points.
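The pointing and coverage ideas in the abstract above can be illustrated with a small sketch. This is a minimal pure-Python illustration, not the authors' implementation. It assumes the output distribution is a convex mixture controlled by a generation probability `p_gen`, and that coverage is the running sum of past attention weights, penalized by its element-wise overlap with the current attention. The function names `final_distribution` and `coverage_loss` and the toy numbers are hypothetical.

```python
from collections import defaultdict


def final_distribution(p_gen, vocab_dist, attention, source_tokens):
    """Mix a generator distribution with a copy (pointer) distribution.

    p_gen         -- assumed probability of generating from the vocabulary
    vocab_dist    -- dict mapping word -> generator probability
    attention     -- attention weights over source positions (sum to 1)
    source_tokens -- source words, aligned with `attention`
    """
    mixed = defaultdict(float)
    # Generator part: scale every vocabulary probability by p_gen.
    for word, prob in vocab_dist.items():
        mixed[word] += p_gen * prob
    # Pointer part: each source position adds its attention mass to its word,
    # so a source word missing from the vocabulary can still be produced.
    for word, weight in zip(source_tokens, attention):
        mixed[word] += (1.0 - p_gen) * weight
    return dict(mixed)


def coverage_loss(attention_steps):
    """Sum, over decoder steps, the overlap between the current attention
    and the attention accumulated at earlier steps (assumed penalty form)."""
    coverage = [0.0] * len(attention_steps[0])
    total = 0.0
    for attention in attention_steps:
        total += sum(min(a, c) for a, c in zip(attention, coverage))
        coverage = [c + a for c, a in zip(coverage, attention)]
    return total


# Toy example: "argentina" is not in the generator vocabulary,
# yet it receives probability mass through copying.
source = ["germany", "beat", "argentina"]
vocab_dist = {"germany": 0.1, "won": 0.6, "beat": 0.3}
dist = final_distribution(0.5, vocab_dist, [0.2, 0.1, 0.7], source)
```

In this toy setup, attending to the same source position at two consecutive steps yields a positive coverage penalty, while attending to different positions yields none, which is the sense in which coverage discourages repetition.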
Attentional, RNN-based encoder-decoder models for abstractive summarization have achieved good performance on short input and output sequences. For longer documents and summaries, however, these models often include repetitive and incoherent phrases. We introduce a neural network model with a novel intra-attention that attends over the input and continuously generated output separately, and a new training method that combines standard supervised word prediction and reinforcement learning (RL). Models trained only with supervised learning often exhibit “exposure bias” – they assume ground truth is provided at each step during training. However, when standard word prediction is combined with the global sequence prediction training of RL, the resulting summaries become more readable. We evaluate this model on the CNN/Daily Mail and New York Times datasets. Our model obtains a 41.16 ROUGE-1 score on the CNN/Daily Mail dataset, an improvement over previous state-of-the-art models. Human evaluation also shows that our model produces higher quality summaries.
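To make the combination of supervised word prediction and RL more concrete, here is a minimal sketch under stated assumptions; it is not the authors' training code. It assumes a simplified ROUGE-1 F1 (whitespace tokens, no stemming or stopword handling) as the sequence-level reward, a policy-gradient loss with a baseline, and a hypothetical mixing weight `gamma`. The names `rouge1_f1` and `mixed_loss` are illustrative only.

```python
from collections import Counter


def rouge1_f1(candidate, reference):
    """Simplified ROUGE-1: unigram-overlap F1 on whitespace tokens."""
    cand = Counter(candidate.split())
    ref = Counter(reference.split())
    # Clipped overlap: each word counts at most as often as in the reference.
    overlap = sum((cand & ref).values())
    if overlap == 0:
        return 0.0
    precision = overlap / sum(cand.values())
    recall = overlap / sum(ref.values())
    return 2 * precision * recall / (precision + recall)


def mixed_loss(ml_loss, sampled_log_prob, reward, baseline, gamma=0.5):
    """Blend a supervised (maximum-likelihood) loss with an RL loss.

    ml_loss          -- token-level negative log-likelihood of the reference
    sampled_log_prob -- log-probability of a summary sampled from the model
    reward, baseline -- sequence-level scores, e.g. from rouge1_f1
    gamma            -- hypothetical weight on the RL term (0..1)
    """
    # Policy-gradient term: raise the probability of samples that beat the baseline.
    rl_loss = -(reward - baseline) * sampled_log_prob
    return gamma * rl_loss + (1.0 - gamma) * ml_loss


reference = "the cat sat on the mat"
reward = rouge1_f1("the cat sat", reference)     # sampled summary
baseline = rouge1_f1("a dog ran", reference)     # e.g. a greedy summary
loss = mixed_loss(ml_loss=2.0, sampled_log_prob=-1.0,
                  reward=reward, baseline=baseline)
```

Because the reward is computed on the whole generated sequence, the RL term sees the model's own outputs during training, which is how this kind of objective addresses the exposure bias mentioned in the abstract.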
