
Work in progress

Guided Neural Language Generation for Abstractive Summarization Using AMR

This repository contains the code for our EMNLP 2018 paper "Guided Neural Language Generation for Abstractive Summarization Using AMR". It is a fork of OpenNMT-py.

Obtaining the Dataset

We used the Abstract Meaning Representation (AMR) Annotation Release 2.0 (LDC2017T10), which contains manually annotated document and summary AMRs.
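For reference, the AMRs in the release are stored in PENMAN notation, one graph per block preceded by ::id and ::snt metadata lines. The snippet below is an illustrative example of that format, not an excerpt from the corpus:

# ::id illustrative-0001.1
# ::snt The boy wants to go.
(w / want-01
      :ARG0 (b / boy)
      :ARG1 (g / go-02
            :ARG0 b))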

Preprocessing the Data

For preprocessing, clone the AMR preprocessing repository.

git clone https://github.com/sheffieldnlp/AMR-Preprocessing

Run the AMR linearization script on the raw AMR dataset; process the training, test, and development splits if you want to train the model. Linearizing the system summary AMRs from Liu's summarizer ($F), with the raw AMR file as side information ($AMR), is shown further below.

export F_TRAIN=/<path to AMR proxy train>/amr-release-2.0-amrs-training.txt
export F_TEST=/<path to AMR proxy test>/amr-release-2.0-amrs-test.txt
export F_DEV=/<path to AMR proxy dev>/amr-release-2.0-amrs-dev.txt
export OUTPUT=/<output path for the results>/
python var_free_amrs.py -f $F_TRAIN -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var
python var_free_amrs.py -f $F_TEST -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var
python var_free_amrs.py -f $F_DEV -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var

For each split (train, test, and dev), the script produces two files: the sentences (.sent) and the corresponding linearized AMRs (.tf).
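As a sketch (not verified output of this script; the exact tokens depend on the flags above), the example AMR shown earlier would yield a .sent line and a variable-free linearized .tf line roughly like the following, with variables removed by --delete_amr_var and sense suffixes dropped by --no_semantics:

The boy wants to go.
want :ARG0 ( boy ) :ARG1 ( go :ARG0 boy )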

Training a New Model

export SRC=/<path to the linearized AMR tf training file>/all_amr-release-2.0-amrs-training.txt.tf
export TGT=/<path to the sentence training file>/all_amr-release-2.0-amrs-training.txt.sent
export SRC_VALID=/<path to the linearized AMR tf validation file>/all_amr-release-2.0-amrs-dev-all.txt.tf
export TGT_VALID=/<path to the sentence validation file>/all_amr-release-2.0-amrs-dev-all.txt.sent
export SAVE=/<path to save directory>/

python preprocess.py -train_src $SRC -train_tgt $TGT -valid_src $SRC_VALID -valid_tgt $TGT_VALID -save_data $SAVE -src_seq_length 1000 -tgt_seq_length 1000 -shuffle 1
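preprocess.py writes its artifacts under the -save_data prefix. Assuming the file naming of OpenNMT-py releases from this period (worth confirming on this fork), you should see:

<save_data prefix>.train.pt
<save_data prefix>.valid.pt
<save_data prefix>.vocab.pt

The same prefix is what train.py consumes via -data below.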
To prepare inputs for generation, linearize the system summary AMRs produced by Liu's summarizer, passing the raw test AMRs as the side file:

export F=/<path to test summarizer output>/summ_ramp_10_passes_len_edges_exp_0
export OUTPUT=/<path to test preprocessed output>/
export AMR=/<path to AMR>/amr-release-2.0-amrs-test-proxy.txt
python var_free_amrs.py -is_dir -f $F -output_path $OUTPUT --custom_parentheses --no_semantics --delete_amr_var --with_side -side_file $AMR




Then train the model, pointing -data at the prefix passed to -save_data above:

export MODEL=/<path to model save directory>/
export TYPE=<model name>
python train.py -data $SAVE -save_model $MODEL/$TYPE -rnn_size 500 -layers 2 -epochs 2000 -optim sgd -learning_rate 1 -learning_rate_decay 0.8 -encoder_type brnn -global_attention general -seed 1 -dropout 0.5 -batch_size 32
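Training saves a checkpoint per epoch, with validation accuracy and perplexity embedded in the file name. Assuming the checkpoint naming convention of OpenNMT-py from this period, a saved model looks like the following (illustrative; it matches the checkpoint reused in the generation command below):

<save_model prefix>_acc_53.28_ppl_46.79_e126.pt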

Generation with the New Model

Run translate.py on a linearized summary AMR file ($file), supplying the linearized document AMRs and their sentences as side information. The $INPUT and $MODEL paths below reflect the authors' setup and should be adapted:

python translate.py -src $file -output $INPUT/gen/summ_rigotrio_fluent_side/$(basename $file).system -model $MODEL/rse/sprint_1/acc_53.28_ppl_46.79_e126.pt -replace_unk -side_src $INPUT/processed/rigotrio/body_$(basename $file).s -side_tgt $INPUT/processed/rigotrio/body_$(basename $file).sent.s -beam_size 5 -max_length 100 -n_best 1 -batch_size 1 -verbose -psi 0.95 -theta 2.5 -k 15
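translate.py is invoked once per linearized summary AMR file. A minimal driver loop, assuming the .tf files from the side-information preprocessing step live in $OUTPUT (the loop itself is a sketch, not part of the original instructions), would be:

for file in $OUTPUT/*.tf; do
    # same translate.py invocation as above, with each .tf file substituted for $file
    python translate.py -src $file -output $INPUT/gen/summ_rigotrio_fluent_side/$(basename $file).system -model $MODEL/rse/sprint_1/acc_53.28_ppl_46.79_e126.pt -replace_unk -side_src $INPUT/processed/rigotrio/body_$(basename $file).s -side_tgt $INPUT/processed/rigotrio/body_$(basename $file).sent.s -beam_size 5 -max_length 100 -n_best 1 -batch_size 1 -verbose -psi 0.95 -theta 2.5 -k 15
done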
