Archive for the ‘Developer Essentials’ Category
- revised the proposed schema for database definitions upon further research/readings
- implemented two migration files: one creating the Tags model and one creating a join table for the has_and_belongs_to_many relationship between tags and groupings
- implemented some code to retrieve tags for each grouping when the submissions table is loaded (hasn’t been tested yet)
- sync up with David/Bryan about my migration files
- implement code to create tags (need to sync with Bryan on exactly what information the user will enter for each tag, e.g. name, description, etc.)
- once code to create tags is implemented, test creating tags and retrieving when submissions table loaded
- revised Dashboard plan based on feedback from Karen and David
- sketched three versions of Dashboard wireframes and chose one with feedback from David
- started implementing the changes (phase one: separate the view with list on the left that controls the details on the right)
- continue implementation of new design
- commit phase one
- I posted a couple of issues, #1840 and #1841, that both deal with adding consistency to the application
- I have been working on converting the rake Assignment tests over. There are many!
- I hope to have the majority of the rake Assignment tests both reviewed and ported over by the end of the week.
- I was able to get the PDF.js viewer into MarkUs; it can display PDFs and control their zoom level.
- Finish some implementation details and clean up the viewer code.
- Fix styling problems with the viewer in MarkUs.
- Fixed the deprecation warnings reported in the server log when running the rspec tests (Relation#all, Relation#first, etc.), but there are still many deprecation warnings and errors when I run rake test.
- Fix the error that occurs when logging in as admin.
- Eliminate more deprecation warnings and errors.
- Modified the collection process so that PDF files are not converted to images.
- Worked on integrating the full PDF.JS viewer into the grading page.
- Wasn’t able to work on the CSV upload problem because the VM still needs to be configured fully.
- Complete integration of the viewer into the page.
- Plan and start implementing annotations.
- Upgraded Rails to 4.0 before upgrading to 4.1.
- Wrapped the conditions and order options in lambdas. The syntax we used before caused errors after upgrading to Rails 4.
- An error related to Active Record appears when logging in as admin. The error disappeared when I switched back to Rails 3.
- Fix deprecation warnings and errors.
- pushed rspec tests for the Assignment model’s associations and attributes
- Learned more rspec for the next tests I will write.
- Test the Assignment model methods and carry over existing tests from rake test.
- Researched and created a blog post outlining the feasibility of PDF.js
- Made some prototypes to prove PDF.js could work for annotations.
- There is not much documentation of the actual PDF.js API, so a lot of the code had to be reverse engineered.
- Start implementing PDF.js annotation system.
- Continue work on CSV upload problem.
- Submitted pull request for Mark model RSpec test.
- Researched and read some documentation about upgrading from Rails 3.2 to Rails 4, and made a small to-do list.
- Start working on the upgrade by following checklist in the upgrade guide.
- Find new issue to fix.
Current Problems with PDF Viewing and Annotations
- Collecting all submissions is slow because all the PDFs need to be converted into images first. This is a slow, expensive, and fragile operation.
- Images are not resized and are displayed at their native resolution. On smaller screens this makes for a poor user experience, forcing the user to scroll around the page to see the whole image.
- Annotations do not always follow the page smoothly as the screen is scrolled.
After researching the state of online PDF viewers, there do not appear to be many free options that would allow us to do everything we need to do with a PDF. One library stands out: PDF.js. It solves many, but not all, of these issues for us.
How PDF.js Can Help
- There is no longer any need for pre-computation or conversion in order to view the PDFs. This significantly cuts down the collection time for assignments that are mainly composed of PDF documents, and it completely removes the fragile conversion process.
- The library can control the zoom level and rotation of the document, which allows large documents to be viewed easily on most devices.
- It can print a document if need be.
- It supports the native navigation options in PDF files, such as paging, page previews, and the document outline (i.e. navigating through a table of contents).
All of the above features are supported natively. However, the one major feature the library is missing for us is support for adding annotations to an existing document, which is a core feature of the current system. This would need to be implemented by hand, as there are no existing solutions for this library or any other free online system.
PDF.js does not natively support creating annotations on a document. It can display annotations that are already embedded in the PDF but cannot create new ones. I did investigate using the native annotation system; however, there is not much documentation on how it is actually used, so the code would need to be partially reverse engineered to determine exactly what it is doing. Using the native annotation system would be one option. The other would be to build annotations on top of the framework that the PDF.js library provides during its normal rendering process.
- The position and size of the annotations would be based off the current scale of the document when the annotation was added. All annotation size and positions would then be normalized so that when storing them in the database all coordinates are based off the document being at a 100% scale.
- The main hurdle with this system is keeping the position of the annotations in line with the document as it is zoomed and rotated. However this is not impossible and there are hooks into the API that would allow for this to be managed. Using `PDFView.currentScale` we can get the current scale of the document and adjust the annotation scaling and position to the document as the zoom level changes. The same technique could then be applied for rotation.
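The scale arithmetic described in the two points above can be sketched in a few lines. This is only an illustration of the math, written in Ruby rather than in the viewer's JavaScript; the `Annotation` struct and the `normalize` and `denormalize` helpers are hypothetical names, and the scale argument is assumed to be the kind of value `PDFView.currentScale` reports (1.0 meaning 100%). Rotation is not covered here.

```ruby
# Hypothetical annotation box; coordinates are pixels at whatever scale is named.
Annotation = Struct.new(:x, :y, :width, :height)

# Convert a box drawn at `scale` (1.0 == 100%) to its 100%-scale form for storage.
def normalize(box, scale)
  raise ArgumentError, 'scale must be positive' unless scale.positive?

  s = scale.to_f
  Annotation.new(box.x / s, box.y / s, box.width / s, box.height / s)
end

# Convert a stored 100%-scale box back to screen coordinates at the current zoom.
def denormalize(box, scale)
  raise ArgumentError, 'scale must be positive' unless scale.positive?

  s = scale.to_f
  Annotation.new(box.x * s, box.y * s, box.width * s, box.height * s)
end

drawn  = Annotation.new(150.0, 300.0, 75.0, 30.0) # drawn while zoomed to 150%
stored = normalize(drawn, 1.5)                    # what would go in the database
shown  = denormalize(stored, 2.0)                 # redrawn after zooming to 200%
```

Storing only the normalized box means the database never needs to know the zoom level at which an annotation was created; the viewer would re-run the second step whenever the zoom changes.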
I believe using PDF.js is a viable option that could provide a real benefit to the graders marking PDF files. It provides a much more natural experience for viewing and annotating PDF files, and it has the added benefit of removing the slow and fragile conversion process.
- Attended the Code Sprint in Toronto!
- Submitted pull request for rspec tests.
- Worked on the CSV large file upload issue (#1766).
- Started researching pdf annotation improvements.
- We were going to try to set up a VM similar to the production server so that the CSV issue can be replicated and logged. When that is complete I will continue trying to resolve this issue.
- Write a blog post regarding the state of web annotations on PDFs.
- Create prototypes of the PDF annotation feature with PDF.js to determine the feasibility of a new annotation system.
- Attended Code Sprint
- Fixed issue #1768 (API for adding users doesn’t work for REMOTE_USER_AUTH)
- Fixed issue #1718 (Submissions table missing asset)
- In the process of becoming an rspec master.
- Started translating the assignment rake test to an rspec test.
- Also thoroughly going through the Assignment model logic itself; added association options and attribute constraints
- Attended Code Sprint
- Fixed issue #1714 (“Reset API Key” button causes help text to show )
- Fixed issue #1655 (Adding two flexible criteria *WITH SAME NAME* silently fails)
- Started Mark model rspec test
- Continue working on Mark model rspec test
- Make a plan for what needs to be changed for the upgrade
- The development environment for the MarkUs Project was set up using Vagrant and Ubuntu 13.
- Submitted pull request #1750 (https://github.com/MarkUsProject/Markus/pull/1750) for a fix to issue #1728 (https://github.com/MarkUsProject/Markus/issues/1728). This issue was a bug that improperly formatted hyperlinks on the assignment dropdown for assignments marked as hidden. To test the fix, several new assignments were created and marked as hidden. The assignment sidebar worked properly for all these new assignments.
- There was some trouble setting up Vagrant and MarkUs on Windows. To get around this, Ubuntu was installed and MarkUs was set up on this environment instead.
- Getting ready for the UCOSP code-sprint in Toronto! Working on new issues and fixing bugs.
- Set up a stable and usable virtual machine to develop the application in. I had Vagrant running almost immediately and MarkUs shortly thereafter. However, I then noticed that while the application was running in Vagrant there were very long page load times (20+ seconds). Instead, I imported the VirtualBox machine into Parallels. That improved page load times to about 1 to 5 seconds on average.
- I fixed Issue #1493 and submitted a pull request. This was a problem with how totals were displayed and updated in the marks spreadsheet.
- I noticed the “total_grade” column is managed in Ruby, and I am unsure why this value is calculated server side and managed manually when it could simply be a calculated SQL column. I have filed an issue (#1762) regarding this.
- If Issue #1762 is approved I would like to fix that issue.
- Complete some more tasks/issues. Possibly some of the spreadsheet related items since I now have some experience working with that component in the application.
- Attend the Code Sprint!
- Set up development environment. There was trouble setting up MarkUs on Windows 8.1 so Ubuntu 14.04 was installed and it is now working fine.
- Familiarizing myself with Ruby and Git.
- Work on Issue #1759.
- Fix Issue #1759, test and submit pull request.
- Head to Code Sprint in Toronto!
- Fixed Issues #1746 & #1748. Pull requests have been accepted.
- Determine why some rake tests appear as failing for my environment. Potentially create a couple of issues — depending on their validity — and pick up more issues to tackle.
- Set up dev environment. Have a fix for issue 1730
- submit pull request for my fix
- Set up development environment.
- Read the development guidelines and other MarkUs documentation
- Submit pull request for issue-1735
- More reading and preparation for the Sprint.
- Wait for feedback on the pull request.
- Setting up the development environment. Had some issues dual booting Linux Mint 17 and Windows 8.1 and had to format/repartition my poor laptop 4-5 times (all is well now).
- Reading up on Git and Ruby tutorials
- Reading developer guidelines for MarkUs
- Continue getting familiar with Ruby, Rails and MarkUs
- Get assigned an issue to work on
- Attend Code Sprint!
Alex & Mark
- Solve issue #1456 which is related to SVN repo creation issues (which is a good primer to our main project listed for next week)
- Continue our investigation into how Git has been integrated so far into Markus on the Git branch and assess the contributed code for quality before adding more.
There was a lot of work done on MarkUs this summer. (Thanks David, Lawrence, Eugene, Ealona, Su, Mark, and Angelo!) The strange thing about getting lots done is that it is more obvious what needs to be done next, so I have a long list of potential projects for the fall.
1. React tables for spreadsheets, repo browser, and refactoring #1696 (large)
This summer Lawrence made some major changes to how tables are implemented in MarkUs. The biggest change was to use the React JS library to implement the tables. This got rid of a lot of Prototype code, and made it possible to do better sorting and filtering on the tables. Two tables remain to be converted: the repo browser and the spreadsheet. There is also some refactoring work to do, now that we know better how React works. Also, there is some remaining styling work for the new tables.
2. Get rid of the rest of Prototype #1496 (small)
3. Git backend #1698 (large)
MarkUs stores student-submitted files in Subversion. Instructors have the option of allowing students to submit files through the web interface, or may disable the web interface and require that students use Subversion directly to submit their work. One repo is created per student or per group, if groups are allowed. Each assignment is a directory in a repo, and MarkUs tries to reuse repos where possible.
We are in the process of adding support for storing student submissions in Git repos, while at the same time maintaining support for Subversion. Because Git and Subversion have quite different models, this task is more involved than it first appears. Much work has been done on this in the past; new students will spend some time reviewing the existing progress for this feature.
4. Refactor the Admin Dashboard #1668: reloading the graphs takes too long and isn’t really necessary (medium)
The Dashboard view has graphs that show the mark distribution for each assignment. They take some time to load and aren’t really that useful to the administrators. It is nice to get some summary information, but we need to rethink what the best summary data is to display and how to do it. This will be partially a UI design project since we need to rethink the purpose of the dashboard view and what information is most useful to display.
5. Rails 4.0 upgrade (medium?)
Mark Rada began the work of upgrading from Rails 3 to Rails 4 this summer. He did some great work on strong parameters, but there are other parts of the upgrade process that still need work. This project would be a great one for students who have some understanding of Rails and an interest in how the Rails framework works.
6. PDFs – is there a better library to use? (research: small; implementation: large)
Students may submit PDF files. MarkUs converts these files to JPEGs (using Ghostscript) so that the images are a fixed size to facilitate annotation. It is time to do some research to see if better options are available. The goal is to maintain support for annotating PDF files without needing to convert the files. Actually switching to a new PDF option will probably require multiple terms/students.
7. Rspec tests (large, but an ongoing effort)
This summer saw a major effort to change to Rspec tests. Ealona and Su wrote a guide to writing Rspec tests, and have done a fair bit of work implementing some Rspec tests. A goal of this term is to have everyone on the team write some Rspec tests. It will be a good way to really learn what the models and controllers are supposed to do, and will move the project forward. We plan to set aside a few hours at the sprint for writing tests. There are some outstanding Github issues related to missing tests that can easily be closed with some work.
8. Tagging student submissions #886 #325 (medium?)
This is a feature request that we hope will satisfy several use cases. Instructors and TAs have asked for a way to flag assignments to bring them to someone’s attention. TAs have also asked for a way to categorize the submissions so that they can do a quick first pass over them. The ability to add general tags to student submissions will hopefully solve a number of these kinds of issues.
9. Summary page of all the marks for all students (smallish)
There is currently no view that combines marks from all of the different assignments and spreadsheets. This table would look a lot like the Submissions table or the Summaries table, but would have one column for each assignment and spreadsheet. A new feature would be a way for the administrator to specify a weighting for each piece of work to produce a total. (Good for someone not familiar with Rails.)
10. MathJax support for annotations? #285 (medium?)
It would be nice to be able to use math symbols in annotations. The MathJax library seems to be what we want, and some work has been done on this.
11. UX Refresh of the submissions table (Includes #75) (medium)
We haven’t taken a serious look at what is in the submission table for a long time. For example, we probably don’t need the “can begin grading” field. We would also like to be able to show the grader(s) for each group.
12. Section due dates don’t work #1676 (small)
Some courses would like to have a different due date for each section. This feature seems to have numerous problems with it. There is also a proposal to change the UI for how sections are added.
With the declarative power of Rails’ ActiveRecord, it’s very easy to write code that is prone to performance bottlenecks, such as an issue commonly known as the n+1 query problem.
Consider the following simplified MarkUs models:
TaMembership is a join model between Ta and Grouping that represents the assignment of a TA to a grouping so that the TA has the permission to mark students in the grouping.
A view may have a table of groupings and need to display all the TAs assigned to each grouping (in ERB), as in this (overly-simplified) example:
This all looks good until we inspect the query log and find that such a simple snippet of code generated a lot of SQL queries:
This is because ActiveRecord lazily loads associations by default, i.e., the associated model is only loaded (through a SQL query to the database) when the attribute is accessed for the first time. In the above example, an initial query is issued to get a list of Grouping models. Then, for each Grouping instance, the tas attribute is accessed, generating another SQL query to get the associated model instances. This results in a total of n+1 queries, where n is the number of groupings, hence the name of the problem. In a networked production environment, the round-trip cost of issuing a database query is a significant overhead due to network delays. Therefore, in general, issuing n queries performs poorly compared to issuing only one (or some constant number of) queries that achieve the same result.
In this case, performance can be improved by replacing the n+1 queries with only a few. In Rails, this can be achieved using eager loading of associations.
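To make the query counts concrete without a database, here is a rough plain-Ruby sketch. It is not MarkUs or ActiveRecord code: `FakeDB`, its sample rows, and the `lazy_load` and `eager_load` helpers are all hypothetical stand-ins. Each `FakeDB` method pretends to be one SQL query and increments a counter, so the lazy path shows the n+1 pattern and the eager path shows the idea behind `includes`.

```ruby
# Hypothetical in-memory "database" that counts every query it serves.
class FakeDB
  attr_reader :query_count

  def initialize
    @query_count = 0
    @groupings = [{ id: 1 }, { id: 2 }, { id: 3 }]
    # Join rows standing in for TaMembership, with the TA name inlined.
    @memberships = [
      { grouping_id: 1, ta: 'alice' },
      { grouping_id: 2, ta: 'bob' },
      { grouping_id: 3, ta: 'alice' }
    ]
  end

  # Stands in for: SELECT * FROM groupings
  def all_groupings
    @query_count += 1
    @groupings
  end

  # Stands in for: SELECT ... WHERE grouping_id = ? (one query per grouping)
  def tas_for(grouping_id)
    @query_count += 1
    @memberships.select { |m| m[:grouping_id] == grouping_id }.map { |m| m[:ta] }
  end

  # Stands in for: SELECT ... WHERE grouping_id IN (...) (one query in total)
  def tas_for_all(grouping_ids)
    @query_count += 1
    @memberships.select { |m| grouping_ids.include?(m[:grouping_id]) }
                .group_by { |m| m[:grouping_id] }
                .transform_values { |rows| rows.map { |m| m[:ta] } }
  end
end

# Lazy loading: 1 query for the groupings, then 1 more per grouping.
def lazy_load(db)
  db.all_groupings.map { |g| [g[:id], db.tas_for(g[:id])] }.to_h
end

# Eager loading: 1 query for the groupings, 1 for all associations at once.
def eager_load(db)
  groupings = db.all_groupings
  tas = db.tas_for_all(groupings.map { |g| g[:id] })
  groupings.map { |g| [g[:id], tas.fetch(g[:id], [])] }.to_h
end

lazy_db = FakeDB.new
lazy_result = lazy_load(lazy_db)
eager_db = FakeDB.new
eager_result = eager_load(eager_db)
puts "lazy:  #{lazy_db.query_count} queries"
puts "eager: #{eager_db.query_count} queries"
```

Both paths build the same grouping-to-TAs mapping; only the number of round trips differs, and the lazy count grows with the number of groupings while the eager count stays constant.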
By using includes, Rails takes care of the eager loading of the
tas association and issues only two queries:
Just like the joins method, the includes method can handle nested associations as well:
Note that while nested associations can be loaded, doing so is sometimes redundant and causes too much overhead when the nesting gets too deep. In the above snippet, grouping.ta_memberships is a collection of TaMembership instances, where each instance has a ta association, and each one has a collection of Grouping instances. All these eagerly loaded instances already form a pretty large and complex structure, and large structures cause memory bottlenecks. Think about whether there is any redundancy in the eagerly loaded model instances and whether you can reorganize the view or controller to simplify the structure.
Normally the use cases of
includes are in the controllers, in which the model instances are eagerly loaded and passed to the view (the above examples where the
includes calls are in the view are only for demonstration purposes). However, when multiple controllers are using the same set of eagerly loaded associations, consider writing a scope for the model (in Rails 4):
Or the Rails 3 way:
MarkUs still has quite a few instances of the n+1 query problem at the time of writing. With the help of bullet, we can track down the remaining evil n+1 queries in the system.
Bulk operation — “n query problem”
The n+1 query problem most likely occurs when reading from the database (i.e., doing SQL SELECT statements). A related problem, which I call “the n query problem”, may occur when doing bulk operations such as creating, updating, or deleting many records at once.
For example, the problem occurs when doing something like the following:
This generates n
UPDATE statements and n
INSERT statements. These can normally be reduced to just one single query.
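As a rough illustration of what "one single query" means for the INSERT case, the sketch below builds one multi-row INSERT statement in plain Ruby. It is not how MarkUs or any particular gem does it: the `bulk_insert_sql` helper, the `ta_memberships` table name, and the column names are assumptions made for the example, and the helper only accepts integer values so that no quoting or escaping is needed.

```ruby
# Build one multi-row INSERT statement for integer-only columns (hypothetical helper).
def bulk_insert_sql(table, columns, rows)
  raise ArgumentError, 'no rows to insert' if rows.empty?

  values = rows.map do |row|
    raise ArgumentError, 'row length mismatch' unless row.length == columns.length

    # Integer() rejects anything that is not a whole number, so no quoting is required.
    "(#{row.map { |v| Integer(v) }.join(', ')})"
  end
  "INSERT INTO #{table} (#{columns.join(', ')}) VALUES #{values.join(', ')}"
end

ta_ids = [1, 2]
grouping_ids = [10, 11]
# Every (grouping, TA) pair becomes one row of the single statement.
rows = grouping_ids.product(ta_ids)
sql = bulk_insert_sql('ta_memberships', %w[grouping_id ta_id], rows)
puts sql
```

Assigning two TAs to two groupings this way costs one round trip instead of four; in a real application the statement would be produced by a library that also handles quoting, validation, and timestamps.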
Use activerecord-import. This is by far the most DBMS-independent gem for ActiveRecord bulk creation. ActiveRecord’s
create method actually supports bulk creation, but it depends on the underlying database driver for Ruby to do the actual bulk creation. At the time of writing, the PostgreSQL driver pg still doesn’t support bulk creation.
For a sample usage, refer to Grouping.assign_tas.
Bulk update and bulk deletion
For a sample usage, refer to Grouping.unassign_tas.
The Upgrading to Rails 4 guide is a good first step to understanding what needs to be changed. Rails’ own Upgrading Guide also has a list of things to pay attention to. Neither guide is a superset of the other, and I have had to look elsewhere to solve a few of the upgrading issues.
Here is a short list of the upgrading tasks that I could not back port which caused varying amounts of trouble:
You will want to look more closely at the versions that the first guide suggests. In most cases a newer maintenance release of a gem is available and should be selected instead.
One exception to this is the sass-rails gem. It will need to be set to version 4.0.1. Why? Because MarkUs needs a newer version of Sass. Sass 3.3.x is required because Sass 3.2.x fails to parse MarkUs scss code. Though it does not appear to make sense, the older sass-rails gem had a looser version specification for sass, and so by rolling back sass-rails to 4.0.1 we allow bundler to choose a newer version of sass itself.
When removing strong_parameters from the Gemfile, do not add the protected_attributes gem for backwards compatibility. MarkUs has already been upgraded to use strong parameters. Since strong_parameters is part of Rails 4, it should not be listed in the Gemfile.
Lastly, the minitest gem should be removed. Rails 4 depends on minitest itself; it is best to avoid conflicting version requirements.
The catch-all route should get changed to work via :all HTTP verbs if possible: