Automatic Detection of Answers to Research Questions from Medline
Abstracts
Abdulaziz Alamri and Mark Stevenson
Department of Computer Science
The University of Sheffield
Sheffield, UK
adalamri1@sheffield.ac.uk; mark.stevenson@sheffield.ac.uk
Proceedings of the 2015 Workshop on Biomedical Natural Language Processing (BioNLP 2015), pages 141–146,
Beijing, China, July 30, 2015. © 2015 Association for Computational Linguistics

Abstract

Given a set of abstracts retrieved from a search engine such as PubMed, we aim to automatically identify the claim zone in each abstract and then select the best sentence(s) from that zone that can serve as an answer to a given query. The system can provide a fast access mechanism to the most informative sentence(s) in abstracts with respect to the given query.

1 Introduction

The large volume of medical literature hinders professionals from analyzing all the knowledge relevant to particular medical questions. Search engines are increasingly used to access such information. However, such systems retrieve documents based on the appearance of the query terms in the text, even though those documents may describe a different problem.

The search engine PubMed®, for example, is a well-known IR system that provides access to more than 24 million abstracts from the biomedical literature, including Medline® (Wheeler et al., 2008). The engine takes a query from the user and returns a list of abstracts that may be relevant or partially irrelevant to the query, which requires the user to go through each abstract for further analysis and evaluation.

Researchers who conduct a systematic review (Gough et al., 2012) tend to use the same approach to collect the studies of interest; however, they are found to spend significant effort identifying the studies that are relevant to the research question. Relevance is usually measured by scanning the result and conclusion sections to identify the authors' claim and then comparing the claim with the review question, where a claim can be defined as the summary of the main points presented in a research argument.

Incorporating a middle-tier system between the search engine and the user would help minimize the effort required to filter the results. This research presents a system that aids those searching for studies that discuss a particular research question. The system acts as a mediator between the search engine and the user. It interprets the search engine results and returns the most informative sentence(s) from the claim zone of each abstract that are potential answers to the research question. The system reduces the cognitive load on users by helping them identify relevant claims within abstracts.

The system comprises two components. The first component identifies the claim zone in each abstract using the rhetorical moves principle (Teufel and Moens, 2002), and the second component uses the sentences in the claim zone to predict the most informative sentence(s) from each abstract with respect to the given query.

This paper makes three contributions: presenting a new set of features to build a classifier that identifies the structural role of sentences in an abstract and performs at least comparably to current systems; building a classifier to detect the best sentence(s) (lexically) that can be an answer to a given query; and introducing a new feature (Z-score) for this task.

2 Related Work

We are not aware of any work that has explicitly discussed the detection of the claim sentence most related to a predefined question; however, studies have discussed related research.

Ruch et al. (2007), for example, used the rhetorical moves approach to identify the conclusion sentences in abstracts. Their system was based on a Bayesian classifier with normalized n-grams and relative position features. The main objective of that research was to identify sentences that belong to the conclusion sections of abstracts; they regarded such information as key to determining the research topic. Our research is similar to that work since we use the conclusion section to identify the key information in an abstract with respect to a query, but we also include the result sections.

Hirohata et al. (2008) presented a similar system using CRFs to classify abstract sentences into four categories: objective, methods, results, and conclusions. That classifier takes into account neighbouring features of sentence S_n, such as the n-grams of the previous sentence S_{n−1} and the next sentence S_{n+1}.

Agarwal et al. (2009) described a system that automatically classifies sentences that appear in full biomedical articles into one of four rhetorical categories: introduction, methods, results and discussions. The best system was achieved using Multinomial Naive Bayes. They reported that their system outperformed their baseline system, which was rule-based.

Recently, Yepes et al. (2013) described a system to index Gene Reference Into Function (GeneRIF) sentences that show novel functionality of genes mentioned in Medline. The goal of that work was to choose the sentences most likely to be selected for GeneRIF indexing. The best system was achieved using a Naive Bayes classifier and various features, including the discourse annotations (the NLM category labels) for the abstract sentences.

Our research is close to the system of Hirohata et al. (2008) since we use the same algorithm, but we use a different set of features to build the model. Moreover, it is similar to the system of Yepes et al. (2013) since we use the value of the nlmCategory attribute rather than the labels provided by the authors to learn the role of sentences.

3 Method

3.1 Claim Zoning Component

This component is based on the hypothesis that the contribution of a research paper tends to be found within the result or conclusion sections of its abstract (Lin et al., 2009). Identifying these sections manually, especially in unstructured abstracts, is a tedious task. Medical abstracts tend to have a logical structure (Orasan, 2001) in which each section represents a different role.

Unfortunately, about 70% of Medline abstracts are unstructured (have no section labels). Structured abstracts use a variety of these labels. The National Library of Medicine (NLM) has reported that 2,779 headings have been used to label abstract sections in Medline (Ripple et al., 2012).

Relying on the labels provided by the abstracts' authors to identify the roles of the sentences could be useful for research purposes, but in practice this means all Medline abstracts, even the structured ones, would need to be re-annotated to guarantee that they are labelled with the same set of annotations. This is not efficient, especially given the huge volume of the Medline repository.

To address that problem, we use the NLM category value assigned to each section in the XML abstract (the nlmCategory attribute). The NLM assigns five possible values (categories): Objective, Background, Methods, Results and Conclusions. This research uses these categories as an alternative way to learn the roles of abstract sentences. This resolves two problems: first, the roles of sentences in structured abstracts can be automatically learned from the value of the nlmCategory attribute without any further processing, so the roles of sentences in 30% of the Medline abstracts can be accurately identified; second, those labels can be used to build a machine learning classifier to predict the roles of sentences in the unstructured abstracts in Medline.

The claim zoning component treats identifying the roles of sentences as a sequence labelling problem. This requires an algorithm that takes into account the neighbouring observations rather than only the current observation, as ordinary classifiers such as SVM and Naive Bayes do. The Conditional Random Fields (CRF) algorithm has been used successfully for such tasks (Hirohata et al., 2008; Lin et al., 2009). Therefore, we use the CRF algorithm along with lexical, structural and sequential features to build a classifier model to identify the claim zones in abstracts. The classifier is implemented using the CRFsuite library (Okazaki, 2007) with the L-BFGS method. Note that we reduce the five NLM categories to four by merging the Background and Objective categories into a new category called Introduction. That is because the background and objective sections in Medline tend to overlap with each other (Lin et al., 2009). Moreover, these sections usually appear sequentially, so merging them is a sensible way to avoid the overlapping problem. Therefore, this component identifies the