MarkUs Blog

MarkUs Developers Blog About Their Project

Markus Plagiarism

Two students, Benjamin and Shion, are working on Markus Plagiarism under the
supervision of Morgan Magnin and Guillaume Moreau, both teachers in the
computer science department. I am in charge of helping the students with the
technical aspects of Markus.

The Markus Plagiarism Project

Plagiarism can be defined as trying to pass off (parts of) documents (text,
source code…) written by someone else as one’s own (i.e., without indicating
which parts are copied from which author). Plagiarism occurs often in academic
environments: students, intentionally or unintentionally, include sources in
their work without a proper reference. Manually detecting plagiarism in a
collection of hundreds of student submissions is infeasible and uneconomical.

Since Markus is used to grade assignments, it would be useful to know when a
student’s work is plagiarized from another student’s work.

That is the purpose of this project: integrating a tool to detect plagiarism
in the projects available through the Markus interface. Some software already
exists. The students’ mission is therefore either to develop anti-plagiarism
tools or to work out how to integrate existing ones into Markus.

Organization of the project

The way Ecole Centrale de Nantes manages student projects is a bit different
from the Canadian format.

In particular, before beginning a project, students are expected to write a
document that gives the specifications of the project and describes all of its
steps. You can find an example for the Sections implementation.

That also means each project has a specific goal, and should be completed by
the end of the term.

For Markus Plagiarism, the first step Benjamin and Shion are taking is to have
a look at ways to detect plagiarism, and existing solutions. The second step
will be to decide on a solution, then to start implementing it.

Possible Solutions

Ideas of criteria for plagiarism

Here are a few ideas Benjamin and Shion found for detecting plagiarism in
source code. Some of them were provided by Guillaume Moreau, and were mentioned
in a previous post:

  • Compare the names and/or the number of classes and functions
  • Compare the structure of the scheme of classes (number of classes and
    associated functions and attributes)
  • Compare a randomly chosen sequence (a few lines) of the code, forgetting
    about the layout.
  • Compare the order of all variables used, excluding names often used such as
    i, j, temp…
  • Compare the order of elements (classes, functions, variables)
  • Compare the comments, and also check the consistency of spelling mistakes
    throughout the comments
  • Check for the homogeneity of coding styles through the entire code.
  • When tests are provided with the source code, test the compilation of the
    code. If the code does not compile, there is a high probability of
    plagiarism.
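
To make two of these criteria more concrete (comparing code while forgetting
about the layout, and comparing the order of variables while excluding common
names), here is a minimal sketch in Python. It is not the solution Benjamin and
Shion will choose: the function names are hypothetical, and the regular
expression is a simplification that also picks up identifiers inside comments
and strings.

```python
import re
from difflib import SequenceMatcher

# Names so common that sharing them says little (taken from the list above).
COMMON_NAMES = {"i", "j", "temp"}

# Rough, language-agnostic notion of an identifier (an assumption).
IDENTIFIER = re.compile(r"[A-Za-z_][A-Za-z0-9_]*")


def normalize_layout(source: str) -> list[str]:
    """Drop blank lines and collapse whitespace so layout changes are ignored."""
    lines = (" ".join(line.split()) for line in source.splitlines())
    return [line for line in lines if line]


def identifier_order(source: str, keywords: frozenset[str] = frozenset()) -> list[str]:
    """Return identifiers in order of first appearance, skipping common names."""
    seen: list[str] = []
    for name in IDENTIFIER.findall(source):
        if name in COMMON_NAMES or name in keywords or name in seen:
            continue
        seen.append(name)
    return seen


def similarity(a: list[str], b: list[str]) -> float:
    """Ratio in [0, 1]; 1.0 means the two sequences are identical."""
    return SequenceMatcher(None, a, b, autojunk=False).ratio()
```

Two submissions could then be compared with
similarity(normalize_layout(first), normalize_layout(second)), and likewise
with identifier_order. The keywords argument is where the reserved words of the
assignment's language would go, so that they are not counted as variables.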


Different criteria can be more or less relevant depending on the size and
complexity of the code and on the language used. Also, teachers may provide
code to students for an assignment. The two solutions proposed by Benjamin and
Shion to counter these issues are the following:

  • display a probability rate, or a score instead of a boolean result
  • the teacher would set some parameters depending on the assignment (like the
    code provided)
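
As a rough illustration of these two proposals, the sketch below (in Python)
removes the lines handed out by the teacher before any comparison, and combines
per-criterion similarities into a single score instead of a yes/no answer. The
function names, the criterion names and the idea of using a weighted average
are my assumptions, not something Benjamin and Shion have decided.

```python
def strip_provided(lines: list[str], provided: list[str]) -> list[str]:
    """Remove lines that also appear in the code handed out by the teacher."""
    handed_out = set(provided)
    return [line for line in lines if line not in handed_out]


def plagiarism_score(criteria: dict[str, float], weights: dict[str, float]) -> float:
    """Weighted average of per-criterion similarities, each in [0, 1].

    `weights` stands in for the parameters the teacher would set per
    assignment; a criterion with no weight is ignored.
    """
    total = sum(weights.get(name, 0.0) for name in criteria)
    if total == 0:
        return 0.0
    weighted = sum(value * weights.get(name, 0.0) for name, value in criteria.items())
    return weighted / total
```

For example, a teacher who trusts the layout comparison three times more than
the variable comparison could pass weights such as {"layout": 3.0,
"variables": 1.0} and then decide which score is high enough to deserve a
manual check.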

Benjamin and Shion also raised the possibility of performance issues. I’m not
too worried about that, as the process can run at night, or at a time when
neither students nor teachers are expected to be working.

I will soon post more information on the solutions Benjamin and Shion have
found and studied!

Written by nvaroqua

February 1st, 2011 at 7:14 pm
