NotesFAQContact Us
Collection
Advanced
Search Tips
Back to results
Peer reviewed Peer reviewed
Direct linkDirect link
ERIC Number: EJ1303381
Record Type: Journal
Publication Date: 2021
Pages: 25
Abstractor: As Provided
ISBN: N/A
ISSN: EISSN-2365-9440
EISSN: N/A
Paraphrase Type Identification for Plagiarism Detection Using Contexts and Word Embeddings
Alvi, Faisal; Stevenson, Mark; Clough, Paul
International Journal of Educational Technology in Higher Education, v18 Article 42 2021
Paraphrase types have been proposed by researchers as the paraphrasing mechanisms underlying acts of plagiarism. Synonymous substitution, word reordering and insertion/deletion have been identified as some of the common paraphrasing strategies used by plagiarists. However, similarity reports generated by most plagiarism detection systems provide a similarity score and produce matching sections of text with their possible sources. In this research we propose methods to identify two important paraphrase types -- synonymous substitution and word reordering in paraphrased, plagiarised sentence pairs. We propose a three staged approach that uses context matching and pretrained word embeddings for identifying synonymous substitution and word reordering. Our proposed approach indicates that the use of Smith Waterman Algorithm for Plagiarism Detection and ConceptNet Numberbatch pretrained word embeddings produces the best performance in terms of F1 scores. This research can be used to complement similarity reports generated by currently available plagiarism detection systems by incorporating methods to identify paraphrase types for plagiarism detection.
BioMed Central, Ltd. Available from: Springer Nature. 233 Spring Street, New York, NY 10013. Tel: 800-777-4643; Tel: 212-460-1500; Fax: 212-348-4505; e-mail: customerservice@springernature.com; Web site: https://bibliotheek.ehb.be:2129/gp/biomedical-sciences
Publication Type: Journal Articles; Reports - Research
Education Level: N/A
Audience: N/A
Language: English
Sponsor: N/A
Authoring Institution: N/A
Grant or Contract Numbers: N/A