Automated Content Analysis: A Case
Study of Computer Science Student Summaries
In Proceedings of the Thirteenth Workshop on Innovative Use of NLP for Building Educational Applications, pages 264–272, New Orleans, Louisiana, June 5, 2018. ©2018 Association for Computational Linguistics

Abstract

Technology is transforming Higher Education learning and teaching. This paper reports on a project to examine how and why automated content analysis could be used to assess précis writing by university students. We examine the case of one hundred and twenty-two summaries written by computer science freshmen. The texts, which had been hand scored using a teacher-designed rubric, were autoscored using the Natural Language Processing software PyrEval. Pearson's correlation coefficient and Spearman rank correlation were used to analyze the relationship between the teacher score and the PyrEval score for each summary. Three content models automatically constructed by PyrEval from different sets of human reference summaries led to consistent correlations, showing that the approach is reliable. We also observed that, in cases where student assessment centers on formative feedback, categorizing the PyrEval scores by examining their average and standard deviation could lead to novel interpretations of their relationships. It is suggested that this project has implications for the ways in which automated content analysis could be used to help university students improve their summarization skills.

1 Situating Automated Content Analysis in Higher Education

Our present concerns are about CS students having difficulty summarizing or synthesizing texts accurately. Instead of staying focused, some tend to wander away from significant learning points in written reports. There are also issues relating to CS instructors wasting valuable time on badly written reports, especially in cases when class sizes are very large (with 150 to 250 students). This often results in students not receiving meaningful feedback that could help them to advance their learning. Increasing the availability and quality of timely feedback could significantly improve students' written-communication skills.

The focus of this study is to investigate how PyrEval (Gao et al., 2018a), an existing summary content analysis software tool, might be used to automate the assessment of student summaries, given a small set of reference summaries from which to construct a content model. Scores from an earlier implementation of automated pyramid scoring were shown to have a high Pearson correlation of 0.83 with a main ideas rubric applied to 120 community college summaries (Passonneau et al., 2016); on the same summaries PyrEval has an even higher correlation of 0.87. As such, the aim is not to examine its correctness here; instead, we seek to understand how it could be adapted for use within Higher Education (HE). In particular, we are interested in exploring how PyrEval might be used for formative, rather than summative, assessment of student work. With this view, the discussions here focus on PyrEval as a tool for helping students to improve written assignments prior to submission, thereby making the time instructors spend marking more beneficial.

Learning in HE, often described as constructivist, involves learners actively constructing knowledge and meaning based on prior experiences (Barr and Tagg, 1995; Bostock, 1998; Brockbank and McGill, 2007; Tess, 2013). In this approach, students create knowledge by connecting what they already know to new subject content encountered in lectures, texts and discussions. This shift in paradigm, from one where the learner retrieves information from the instructors, has prompted recently coined phrases such as self-directed learning (Hiemstra, 1994) and student-centered learning (Lea et al., 2003). Unfortunately, assessing students' self-directed learning, and providing formative feedback in this learning approach, has not developed as rapidly.
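The correlation analysis and the mean/standard-deviation categorization described in the abstract can be illustrated with a short, self-contained sketch. The scores below are invented for illustration (they are not the study's data), PyrEval itself is not invoked, and the banding thresholds are only one plausible choice; Pearson's r and Spearman's rank correlation are computed from first principles.

```python
# Illustrative sketch only: made-up teacher rubric scores (0-5) and
# PyrEval-style scores (0-1); not the data reported in this paper.

def pearson(x, y):
    """Pearson product-moment correlation coefficient."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sum((a - mx) ** 2 for a in x) ** 0.5
    sy = sum((b - my) ** 2 for b in y) ** 0.5
    return cov / (sx * sy)

def ranks(x):
    """1-based ranks, with ties assigned the average of their positions."""
    order = sorted(range(len(x)), key=lambda i: x[i])
    r = [0.0] * len(x)
    i = 0
    while i < len(x):
        j = i
        while j + 1 < len(x) and x[order[j + 1]] == x[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            r[order[k]] = avg
        i = j + 1
    return r

def spearman(x, y):
    """Spearman rank correlation = Pearson correlation of the ranks."""
    return pearson(ranks(x), ranks(y))

teacher = [5, 4, 4, 3, 2, 5, 1, 3]
pyreval = [0.91, 0.74, 0.80, 0.55, 0.40, 0.88, 0.25, 0.60]
print(f"Pearson:  {pearson(teacher, pyreval):.3f}")   # ≈ 0.99 on this toy data
print(f"Spearman: {spearman(teacher, pyreval):.3f}")  # ≈ 0.98 on this toy data

# Categorizing scores by mean and standard deviation, as the abstract
# suggests for formative feedback (mean ± 1 std is our own illustration).
n = len(pyreval)
mean = sum(pyreval) / n
std = (sum((s - mean) ** 2 for s in pyreval) / n) ** 0.5
for s in pyreval:
    band = "high" if s >= mean + std else "low" if s <= mean - std else "mid"
    print(f"{s:.2f} -> {band}")
```

With real data one would normally call `scipy.stats.pearsonr` and `scipy.stats.spearmanr` instead; the hand-rolled versions above simply make the two statistics, and the tie handling in the ranking step, explicit.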
Feedback is intended to provide students with information on their current state of learning and performance, and is essential for elevating students' motivation and confidence (Hyland, 2000). Rather than being an evaluation of performance on assigned tasks, formative feedback provides information to help students scaffold their knowledge and accelerate their learning (Sadler, 2010). Formative assessment applications therefore play an important role by helping students take greater control of their own learning, and move them towards becoming self-regulated learners.

Within HE, formative feedback is perceived as information communicated to the students about learning-oriented assignments (Race, 2001) such as essays. This feedback can be oral or written, and is often generated by the instructor. Providing feedback remains the responsibility of the instructor, and with much emphasis being placed on evaluating student learning at the end of an instructional unit, instructor feedback is often limited. Some institutions even use custom software, such as E-rater®, used by the Educational Testing Service for automated scoring of essays, which provides a holistic score rather than a narrative. Our present concerns move beyond simply providing a score to examine how and why PyrEval could be used to provide formative feedback on students' summaries. It is distinctive in providing interpretable scores that can be justified by automated identification of important, unimportant and missing content (Passonneau et al., 2016). This study provides a conceptualization for the next steps in the development of the tool towards this end.

The next three sections present the following: background to the study through a review of existing literature; a summarization task given to CS students at a UK university, along with a description of how it was assessed by the instructor, one of the authors; and PyrEval, an automated tool to analyze the content of summaries that depends on a reference set of four or more expert summaries. Section 5 presents our experiments to compare PyrEval scores of the students' summaries with scores assigned by the human scorer using a rubric. The findings show that PyrEval scores correlate moderately well with the rubric, but more importantly, the analysis led to reconsideration of scores for several summaries. Section 6 discusses the benefits and limitations of the automated tool, and our plans for future work.

2 Related Work

Summarization is an important pedagogical tool for teaching reading and writing strategies in elementary school (Kırmızı, 2009), middle school (Graham and Perin, 2007), and community college (Perin et al., 2013), as part of blended instructional methods at the college level (Yang, 2014), and for English language learners (Babinski et al., 2017). Instruction in summarization strategies includes occasional forays into computer-based training (Sung et al., 2008), including intelligent tutoring systems that provide writing practice (Proske et al., 2012; Roscoe et al., 2015).

Recent work built regression models to predict scores based on several rubrics for summaries from L2 business school students (Sladoljev-Agejev and Šnajder, 2017). Features were automatically derived from Coh-Metrix (McNamara et al., 2014), BLEU scores (Papineni et al., 2002) and ROUGE scores (Lin, 2004). In (Srihari et al., 2008), OCR was used to digitize handwritten essays, which were then scored using various automated essay scoring methods, including latent semantic analysis and a feature-based approach. Essays are automatically scored in (Zupanc and Bosnić, 2017) after constructing an ontology from model essays using information extraction and logic reasoning. PyrEval constructs a content model from a small set of reference summaries, using latent semantic vectors to represent the meanings of phrases.

There has been recent interest in developing automated revision tools for students' written work, but none have, hitherto, been reported in the literature. There is existing work on automated revision of short answers for middle school science writing (Tansomboon et al., 2017), and a corpus on automated revision of argumentation (Zhang et al., 2017). What is distinctive about our work is the feasibility of providing automated feedback on summary content, either for teachers or students, which could ultimately lead to the development of an automated revision tool.

3 Task and Educational Rubrics