Novelty based Ranking of Human Answers for Community Questions

Adi Omari
Technion IIT
Haifa 32000, Israel
omari@cs.technion.ac.il

David Carmel, Oleg Rokhlenko, Idan Szpektor
Yahoo Research
Haifa 31905, Israel
{dcarmel,olegro,idan}@yahoo-inc.com

ABSTRACT
Questions and their corresponding answers within a community-based question answering (CQA) site are frequently presented as top search results for Web search queries and viewed by millions of searchers daily. The number of answers for CQA questions ranges from a handful to dozens, and a searcher would typically be interested in the different suggestions presented in the various answers for a question. Yet, especially when many answers are provided, the viewer may not want to sift through all answers but to read only the top ones. Prior work on answer ranking in CQA considered the qualitative notion of each answer separately, mainly whether it should be marked as best answer. We propose to promote CQA answers not only by their relevance to the question but also by the diversification and novelty qualities they hold compared to other answers. Specifically, we aim at ranking answers by the amount of new aspects they introduce with respect to higher ranked answers (novelty), on top of their relevance estimation. This approach is common in Web search and information retrieval, yet it was not addressed before within the CQA setting, which is quite different from classic document retrieval. We propose a novel answer ranking algorithm that borrows ideas from aspect ranking and multi-document summarization, but adapts them to our scenario. Answers are ranked in a greedy manner, taking into account their relevance to the question as well as their novelty compared to higher ranked answers and their coverage of important aspects. An experiment over a collection of Health questions, using a manually annotated gold-standard dataset, shows that considering novelty for answer ranking improves the quality of the ranked answer list.

Keywords: Community-based question answering; Novelty; Diversification

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. Copyrights for components of this work owned by others than ACM must be honored. Abstracting with credit is permitted. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee. Request permissions from permissions@acm.org.
SIGIR '16, July 17–21, 2016, Pisa, Italy.
© 2016 ACM. ISBN 978-1-4503-4069-4/16/07...$15.00
DOI: http://dx.doi.org/10.1145/2911451.2911506

1. INTRODUCTION
Community-based Question Answering (CQA), a platform in which people can ask questions and receive answers from other people, has become a useful tool for information needs that are not answered simply by viewing a Web page, including recommendations, suggestions and homework help [26]. In popular CQA sites, such as Yahoo Answers, Baidu Zhidao, Answers.com, and Stack Overflow, hundreds of millions of answered questions have been collected. The answers to the questions are not only viewed by the asker herself, but are frequently presented as top search results for Web search queries and viewed by millions of searchers daily, in the form of a question and its corresponding answers.

The number of answers for CQA questions ranges from a handful to dozens, and sometimes even hundreds in the case of popular questions. We found that in Yahoo Answers more than 38% of the answered questions have at least 5 answers. For some questions, such as factoids, readers would be content with a single high-quality answer. However, in other types of questions, such as asking for recommendations or opinions, the asker as well as other viewers would benefit from different views or suggestions. Still, especially when many answers are provided, the reader may not want to sift through all answers but to read only the top ones. While a few works did address the task of answer ranking in CQA [22, 42, 45], they considered mainly the qualitative notions of each answer separately, its relevance to the question or whether it should be marked as best answer. These works did not address the overall quality of the ranked list of answers. In particular, they did not consider the complementary information provided by different answers.

In this paper we follow diversification approaches in Web search and Information Retrieval (IR) [13, 14] and promote CQA answers not only by their relevance to the question but also by the diversification and novelty qualities they hold compared to other answers. Specifically, assuming the information need behind a CQA question can be partitioned into relevant "subtopics" or "aspects", our goal is to rank the corresponding answers not only by their relevance but also by the amount of aspects they cover (diversity), and more specifically, by the amount of new aspects they introduce with respect to higher ranked answers (novelty). Though diversification of CQA answers for input questions was considered before, it was under an IR setup, where results are retrieved from a large collection of answers [1]. To the best of our knowledge, this is the first time this task is addressed under the CQA setup, where the dozen or so answers to be ranked are manually provided by answerers directly for the target question.

There is a large body of research on document novelty and diversification for Web search and IR [13, 4, 2, 33, 12, 43]. Yet, under a CQA setup this task bears significantly different traits. First, since the answers are provided by humans in direct response to the given question, most of these answers are relevant to the question to some extent [3]. This is a very different scenario compared to document retrieval, in which only a handful of documents are relevant out of a large list of matching documents. As a result, IR methods that incorporate novelty detection on top of relevance estimation (e.g., MMR [8]) are somewhat unfit for the CQA scenario (see Section 4). Second, CQA answers are typically much shorter than Web documents, and are therefore more condensed in terms of the content they provide. Third, IR methods aim at short ambiguous Web queries as input, while CQA questions are longer and more detailed.

Another task that our CQA scenario resembles is that of summarizing different news articles on the same event [30]. In this scenario all news articles (answers) are "relevant", describing the same event (question), but may provide different views and facts (aspects) on the event by different reporters (answerers). The news articles should be summarized to provide a comprehensive view of the event. Specifically, in query-focused summarization, a typical approach is to rank sentences based on their similarity to the query and then cluster them and pick representative sentences based on the clusters [25].

While drawing similarities between news summarization and our task, the final goal is quite different, since we do not need to provide a summarization of the answers but to rank them. Furthermore, news articles are longer and well structured. This is not the case in CQA answers, which are typically short with many empty connecting unstructured sentences. Most notably for our task, many aspects may be clamped together in a single sentence, which makes the typical approach of looking at a sentence as an atomic text unit inappropriate. As an example, consider the question "Whats your best migraine cure?" and the provided answers "1) Excedrine migraine, phenergan, dark room, cold compress, 8 hours of sleep", and "2) Take medicine, go in a dark room and sleep for at least an hour, it helps to use earplugs". These answers each contain several complementary suggestions and they share several discussed topics: sleeping, being in a dark room, and taking medicine.

The method we propose for novelty-based answer ranking looks at syntactic propositions instead of sentences as the atomic text units. Under this view, answer 2 in the example above is decomposed into "2.1) Take medicine", "2.2) go in a dark room and sleep for at least an hour" and "2.3) it helps to use earplugs". We then measure the similarity between propositions and generate a hierarchical clustering of them across all answers. Finally, answers are ranked in a greedy manner based on the amount of diverse propositions they contain, taking into account each proposition's relevance to the question as well as its dissimilarity to propositions in higher ranked answers.
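As a minimal illustration of this greedy procedure, the sketch below ranks answers that have already been split into propositions. The scoring functions relevance(p, q) and similarity(p1, p2), the linear relevance/novelty combination, and the weight lambda_ are illustrative assumptions, not the exact model developed in the following sections.

def rank_answers(question, answers, relevance, similarity, lambda_=0.5):
    """Greedy novelty-aware ranking of CQA answers.

    answers: list of answers, each given as a list of proposition strings.
    relevance(p, q): estimated relevance of proposition p to question q.
    similarity(p1, p2): similarity between two propositions, in [0, 1].
    lambda_: illustrative trade-off between relevance and novelty.
    """
    remaining = list(range(len(answers)))
    selected_props = []   # propositions of already ranked answers
    ranking = []

    def answer_score(idx):
        score = 0.0
        for p in answers[idx]:
            rel = relevance(p, question)
            # Novelty: how dissimilar p is from everything ranked so far.
            if selected_props:
                nov = 1.0 - max(similarity(p, s) for s in selected_props)
            else:
                nov = 1.0
            score += lambda_ * rel + (1.0 - lambda_) * rel * nov
        return score

    while remaining:
        best = max(remaining, key=answer_score)
        ranking.append(best)
        selected_props.extend(answers[best])
        remaining.remove(best)
    return ranking

Summing per-proposition scores rewards answers that pack many relevant, not-yet-covered propositions, which is the diversity and novelty intuition described above.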
We tested our algorithm on a collection of health-related questions and their answers from Yahoo Answers, which were manually annotated with gold-standard aspects in each answer. We used conventional diversity-based IR evaluation metrics and also propose two novel evaluation metrics that better emphasize novelty under our settings. We compare our approach with several state-of-the-art novelty-based ranking algorithms and show that our algorithm outperforms prior work under all metrics.
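The conventional metrics themselves are not spelled out in this excerpt; as an illustration of the kind of diversity-aware measure involved, the following is a minimal sketch of alpha-nDCG, a standard aspect-based diversity metric from the IR literature. The binary aspect judgments, the greedy construction of the ideal ranking, and the cutoff k follow the usual conventions for that metric and are not details taken from this paper.

import math

def alpha_dcg(ranking, judgments, alpha=0.5, k=10):
    """alpha-DCG@k; judgments[d] is the set of gold aspects covered by answer d."""
    seen = {}        # aspect -> number of times already covered higher in the list
    dcg = 0.0
    for rank, d in enumerate(ranking[:k], start=1):
        gain = sum((1 - alpha) ** seen.get(a, 0) for a in judgments[d])
        dcg += gain / math.log2(rank + 1)
        for a in judgments[d]:
            seen[a] = seen.get(a, 0) + 1
    return dcg

def alpha_ndcg(ranking, judgments, alpha=0.5, k=10):
    """Normalize by a greedily built (approximately ideal) ranking."""
    remaining, ideal, seen = set(judgments), [], {}
    while remaining:
        best = max(remaining,
                   key=lambda d: sum((1 - alpha) ** seen.get(a, 0)
                                     for a in judgments[d]))
        ideal.append(best)
        remaining.remove(best)
        for a in judgments[best]:
            seen[a] = seen.get(a, 0) + 1
    denom = alpha_dcg(ideal, judgments, alpha, k)
    return alpha_dcg(ranking, judgments, alpha, k) / denom if denom else 0.0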
Our main contributions in this paper are:

• Introducing the task of novelty-based answer ranking under the CQA setup
• A novelty-based answer ranking algorithm for CQA, which considers novelty on top of relevance
• A manually annotated dataset of CQA questions together with gold-standard aspects per answer (available at http://webscope.sandbox.yahoo.com)
• New evaluation metrics that emphasize diversification and novelty in CQA answer ranking

2. RELATED WORK
Novelty-based answer ranking in CQA has not attracted much attention so far. However, it is related to several research fields. In this section we review the most related ones.

2.1 Answer Ranking
Answer ranking is essential for CQA services due to the high variance in the quality of answers. Several previous studies dealt with answer ranking for CQA sites [21, 22, 6, 27, 42, 45]. Jeon et al. [21] predicted answer quality using non-textual features of the answers. Bian et al. [6] integrated answer similarity to the question with community feedback information. Jurczyk and Agichtein [22] measured user expertise using link analysis over the question-answers graph, assuming answers given by authoritative users tend to be of high quality. Tu et al. [42] proposed an analogical reasoning-based method by measuring how valuable an answer is given its similarity to the set of best answers of similar resolved questions. Zhou et al. [45] additionally exploited three categories of user profile information (engagement-related, authority-related and level-related) for answer ranking in CQA.

Other works estimated the likelihood of an answer to be selected as best answer by the asker, an estimate that might be further used for answer ranking. Liu et al. [27] and Shah and Pomerantz [36] trained a classifier that predicts this likelihood based on features describing the question text, category, question-answer similarity, user reputation, and user feedback. Suryanto et al. [40] studied additional features derived from the answerer expertise. The authors argued that an answerer can have different expertise levels for different topics, which should be taken into account during answer quality estimation. Dalip et al. [16] proposed a learning to rank approach using a comprehensive set of eight different groups of features derived from the question-answer pair.

In this work we follow up on relevance estimation in prior work and combine it with novelty detection. We leave the integration with non-textual features such as answerer reputation and user feedback (e.g. votes) for future work.

2.2 Document Diversification
Novelty detection and search result diversification is an important research track in IR. Carbonell and Goldstein [8] proposed the Maximal Marginal Relevance (MMR) approach, in which documents in the result list are evaluated based on their relevance to the query as well as their difference from previously selected documents.
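For reference, the MMR selection criterion is commonly written as follows, where q is the query, R the candidate set, S the set of already selected documents, and λ trades relevance against redundancy:

\[
\mathrm{MMR} = \arg\max_{d_i \in R \setminus S} \Big[ \lambda \, \mathrm{Sim}_1(d_i, q) \;-\; (1 - \lambda) \max_{d_j \in S} \mathrm{Sim}_2(d_i, d_j) \Big]
\]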
Zhai et al. [44] studied how an interactive retrieval system can best support a user that gathers information about the different aspects of a topic. Agrawal et al. [4] selected the next document that best matches DMOZ categories that are related to a Web query but are not well covered by higher ranked documents. The xQuAD algorithm [35] measures document diversity by its relevance to sub-queries of an ambiguous Web query, which are not related to previously ranked documents. At the heart of xQuAD, a set of sub-queries that describe the different aspects of the query is assessed against each retrieved document. The sub-queries are generated using query expansion techniques or using query reformulations from search engines. Croft and Dang [15] suggested using surface terms instead of topics as indications of aspects. They identify candidate terms as words in the vicinity of query terms that appear in retrieved documents and weigh their topicality and productiveness. However, in the CQA setup, question terms, which express an information need, often do not appear in the answers, which express solutions. Therefore, a different aspect identification approach is required. We emphasize that diversification approaches for short ambiguous Web queries, whose aspects can be modeled
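For completeness, the xQuAD criterion discussed above greedily selects, at each step, the document that best covers sub-queries q_i not yet covered by the already selected set S. A common formulation (a sketch of the published criterion, with λ an interpolation weight and R the candidate set) is:

\[
d^{*} = \arg\max_{d \in R \setminus S} \; (1 - \lambda)\, P(d \mid q) \;+\; \lambda \sum_{q_i} P(q_i \mid q)\, P(d \mid q_i) \prod_{d_j \in S} \big(1 - P(d_j \mid q_i)\big)
\]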
