Teachers are turning to essay-grading software to critique student writing, but critics point out serious flaws in the technology.
Jeff Pence knows the best way for his 7th grade English students to improve their writing is to do more of it. But with 140 students, it could take him at least two weeks to grade a batch of their essays.
So the Canton, Ga., middle school teacher uses an online, automated essay-scoring program that allows students to get feedback on their writing before handing in their work.
“It doesn’t tell them what to do, but it points out where problems may exist,” said Mr. Pence, who says the Pearson WriteToLearn program engages his students almost like a game.
With the technology, he has been able to assign an essay a week and individualize instruction efficiently. “I feel it’s pretty accurate,” Mr. Pence said. “Is it perfect? No. But when I reach that 67th essay, I’m not real accurate, either. As a team, we are pretty good.”
With the push for students to become better writers and meet the new Common Core State Standards, teachers are hopeful for new tools to help out. Pearson, which is based in London and New York City, is one of several companies upgrading its technology in this space, also known as artificial intelligence, AI, or machine-reading. New assessments designed to test deeper learning and move beyond multiple-choice answers are also fueling the demand for software to help automate the scoring of open-ended questions.
Critics contend the software doesn’t do much more than count words and therefore cannot replace human readers, so researchers are working hard to improve the algorithms and counter the naysayers.
While the technology has been developed primarily by companies in proprietary settings, there has been a new focus on improving it through open-source platforms. New players in the market, such as the startup venture LightSide and edX, the nonprofit enterprise started by Harvard University and the Massachusetts Institute of Technology, are openly sharing their research. This past year, the William and Flora Hewlett Foundation sponsored an open-source competition to spur innovation in automated writing assessment that attracted commercial vendors and teams of scientists from around the world. (The Hewlett Foundation supports coverage of “deeper learning” issues in Education Week.)
“We are seeing a lot of collaboration among competitors and researchers,” said Michelle Barrett, the director of research systems and analysis for CTB/McGraw-Hill, which produces the Writing Roadmap for use in grades 3-12. “This unprecedented collaboration is encouraging a great deal of discussion and transparency.”
Mark D. Shermis, an education professor at the University of Akron, in Ohio, who supervised the Hewlett contest, said the meeting of top public and commercial researchers, along with input from a variety of fields, may help boost the technology’s performance. The recommendation from the Hewlett trials is that the automated software be used as a “second reader” to monitor the human readers’ performance or to provide additional information about writing, Mr. Shermis said.
“The technology can’t do everything, and nobody is claiming it can,” he said. “But it is a technology that has a promising future.”
The first automated essay-scoring systems date back to the early 1970s, but little progress was made until the 1990s, with the advent of the Internet and the capacity to store data on hard-disk drives, Mr. Shermis said. More recently, improvements have been made in the technology’s ability to evaluate language, grammar, mechanics, and style; detect plagiarism; and provide quantitative and qualitative feedback.
The computer programs assign grades to writing samples, sometimes on a scale of 1 to 6, in a variety of areas, from word choice to organization. Some products give feedback to help students improve their writing. Others can grade short answers for content. The technology can be used in various ways on formative exercises or summative tests to save time and money.
The Educational Testing Service first used its e-rater automated-scoring engine for a high-stakes exam in 1999, on the Graduate Management Admission Test, or GMAT, according to David Williamson, a senior research director for assessment innovation at the Princeton, N.J.-based company. It now uses the technology in its Criterion Online Writing Evaluation Service for grades 4-12.
Over the years, the capabilities have changed substantially, evolving from simple rule-based coding to more sophisticated software systems. And statistical techniques from computational linguistics, natural-language processing, and machine learning have helped develop better methods of identifying certain patterns in writing.
But challenges remain in coming up with a universal definition of good writing, and in training a computer to understand nuances such as “voice.”
Over time, with larger sets of data, experts can identify more nuanced aspects of writing and improve the technology, said Mr. Williamson, who is encouraged by the new era of openness about the research.
“It is a hot topic,” he said. “There are a lot of researchers in academia and industry looking at this, and that is a good thing.”
Along with using the technology to improve writing in the classroom, West Virginia employs automated software for its statewide annual reading language arts assessments for grades 3-11. The state has worked with CTB/McGraw-Hill to customize its product and train the engine, using thousands of papers it has collected, to score the students’ writing in response to a specific prompt.
“We are confident the scoring is very accurate,” said Sandra Foster, the lead coordinator of assessment and accountability in the West Virginia education office, who acknowledged facing skepticism initially from teachers. But some were won over, she said, after a comparability study showed that a trained teacher paired with the scoring engine performed better than two trained teachers. Training involved a few hours on how to assess writing with the rubric. Plus, writing scores have gone up since the technology was implemented.
Automated essay scoring is also being applied to the ACT Compass exams for community college placement, the new Pearson General Educational Development tests for a high school equivalency diploma, and other summative tests. But it has not yet been embraced by the College Board for the SAT or by the rival ACT college-entrance exam.
The two consortia delivering the new assessments under the Common Core State Standards are reviewing machine-grading but have not committed to it.
Jeffrey Nellhaus, the director of policy, research, and design for the Partnership for Assessment of Readiness for College and Careers, or PARCC, wants to know if the technology will be a good fit for its assessment, and the consortium will be conducting a study based on writing from its first field test to see how the scoring engine performs.
Likewise, Tony Alpert, the chief operating officer for the Smarter Balanced Assessment Consortium, said his consortium will evaluate the technology carefully.
With his new company LightSide, in Pittsburgh, founder Elijah Mayfield said his data-driven approach to automated writing assessment sets itself apart from other products on the market.
“What we are trying to do is build a system that, instead of correcting errors, finds the strongest and weakest sections of the writing and where to improve,” he said. “It is acting more as a revisionist than a textbook.”
The new software, which will be available on an open-source platform, is being piloted this spring in districts in Pennsylvania and New York.
In higher education, edX has just introduced automated software to grade open-response questions for use by teachers and professors through its free online courses. “One of the challenges in the past was that the code and the algorithms were not public. They were viewed as black magic,” said edX President Anant Agarwal, noting the technology is in an experimental stage. “With edX, we put the code into open source, where you can see how it is done, to help us improve it.”
Still, critics of essay-grading software, such as Les Perelman, want academic researchers to have broader access to vendors’ products to evaluate their merit. Now retired, the former director of the MIT Writing Across the Curriculum program has studied some of the devices and was able to get a high score from one with an essay of gibberish.
“My main concern is that it doesn’t work,” he said. While the technology has some limited use in grading short answers for content, it relies too much on counting words, and reading an essay requires a deeper level of analysis best done by a person, contended Mr. Perelman.