While many of our statistics reflect
the number of transcriptions someone completed, it is much more
important that they are well made. Or as we often say: "Quality over quantity".
But how can we make sure that this is actually the case? With such a large
volume of transcriptions it would be impossible to review all of them.
In this post we look at the measures we take and how the process has
been drastically improved during the last few months.
Welcoming New Volunteers
The first thing we do is welcome all new volunteers. A mod from the welcome
team will send a friendly message to the new user and give them
feedback on their very first transcription.
This will already solve the most pressing issues, like using the Fancy Pants
editor instead of Markdown mode. In fact, it was our only method of
quality assurance for a long time. But is it enough? Not really.
Considering the large number of different templates and post types, it is
easy to make mistakes even after the initial feedback.
After deploying the new bot system, we gained more options to control the
transcription quality. We added a system that randomly submits new
transcriptions for review, based on the experience of the volunteer.
Since the inception of the checks on June 5th, we have already reviewed over
2,200 transcriptions. This system has proven to be very effective. For
new transcribers, it makes sure that they get familiar with the
templates and understand the quirks of Reddit Markdown. But even for
experienced transcribers it unveiled some issues, such as using old
templates that have been updated recently.
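The experience-based sampling could be sketched roughly like this. Note that the tiers, probabilities, and names below are invented for illustration; the post doesn't state the bot's actual values.

```python
import random

# Hypothetical review tiers: (transcription count threshold, review chance).
# These numbers are assumptions, not the bot's real configuration.
REVIEW_TIERS = [
    (10, 1.00),   # brand-new volunteers: review every transcription
    (50, 0.50),   # some experience: review about half
    (250, 0.20),  # established volunteers: occasional reviews
]
FALLBACK_PROBABILITY = 0.05  # veterans: rare spot checks


def should_review(completed: int, rng: random.Random) -> bool:
    """Decide whether a new transcription is sent for manual review,
    with the chance dropping as the volunteer completes more work."""
    for threshold, probability in REVIEW_TIERS:
        if completed < threshold:
            return rng.random() < probability
    return rng.random() < FALLBACK_PROBABILITY
```

With a tier table like this, every early transcription gets checked while seasoned volunteers only receive occasional spot checks, which matches the behavior described above.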
Automatic Formatting Checks
It became clear that a lot of issues are very common and appear in many
transcriptions. Unfortunately some of them, like not escaping a Reddit
username or accidentally making a heading instead of a separator, are
not visible on all clients without checking the Markdown source.
Therefore, it was a lot of work for the Welcome and User mods to check
for these things and to ask transcribers to fix them again and again.
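To illustrate the kind of quirk meant here, consider this (made-up) transcription source:

```markdown
Some caption
---
#hashtag from the image
Hello u/username
```

Without a blank line before the `---`, "Some caption" renders as a heading instead of being followed by a separator; the unescaped `#` turns the hashtag line into a heading; and the unescaped `u/username` becomes a link that pings the user. The fixed source needs a blank line before the separator and backslash escapes (`\#` and `\u/username`), yet on some clients both versions look identical, which is why these errors slip past a visual check.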
This led to the idea of automated formatting checks. Why review everything
manually if the bot can do it for you? The Development team (with
contributions by u/--B_L_A_N_K--) added automatic detection for a lot of common issues:
Forgetting the header
Making the header bold instead of italic
Forgetting the separators after the header and before the footer
Using a wrong footer
Accidentally making a heading instead of a separator by forgetting the empty line before it
Accidentally making a heading instead of a hashtag by forgetting the backslash escape
Using a fenced code block instead of four spaces before each line
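Checks like these can be approximated with simple pattern matching. The sketch below is a hypothetical reconstruction, assuming plain string and regex checks; the real bot's rules, function names, and messages are not given in the post.

```python
import re


def find_formatting_issues(transcription: str) -> list[str]:
    """Flag common Markdown problems in a transcription source.

    A simplified sketch of the kind of checks described above,
    not the actual bot's implementation.
    """
    issues = []
    lines = transcription.splitlines()

    # Header made bold (**...**) instead of italic (*...*).
    if lines and lines[0].startswith("**"):
        issues.append("header is bold instead of italic")

    # An unescaped u/username renders as a link and pings the user;
    # it should be escaped as \u/username in the Markdown source.
    if re.search(r"(?<!\\)\bu/\w+", transcription):
        issues.append("unescaped u/ username")

    # A '---' directly below a line of text turns that line into a
    # heading; a blank line is needed before the separator.
    for prev, cur in zip(lines, lines[1:]):
        if cur.strip() == "---" and prev.strip():
            issues.append("separator renders the line above as a heading")

    # An unescaped '#' at the start of a line becomes a heading.
    for line in lines:
        if line.startswith("#"):
            issues.append("unescaped '#' renders as a heading")

    # Fenced code blocks don't render on all clients; four spaces of
    # indentation per line is the portable alternative.
    if "`" * 3 in transcription:
        issues.append("fenced code block instead of four-space indentation")

    return issues
```

A clean transcription passes with an empty list, while a single call surfaces every detected problem at once, so the volunteer can fix them all in one edit.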
This helped reduce the workload for the other mod teams and also had
another nice side effect: because a transcription is rejected until its
formatting has been fixed, transcriptions with formatting errors never
enter our database.
Unfortunately, we can't check each transcription for later edits, so the
more issues we can detect before a "done" gets accepted, the better.
Of course, not all issues can be detected automatically, so the welcoming and manual reviews are still important.
Treasure Hunt Reviews
Last but not least, the Engagement team also contributes to the quality
assurance. Every treasure hunt entry is manually checked for accuracy,
including formatting issues, duplicate submissions and use of an
incorrect template. Because we get more than 100 transcriptions
submitted every ten days, this is also a considerable chunk of work,
especially considering that the Engagement team is one of the smallest teams.
And that's it for this
monthly meta! We plan to further improve our quality measures in the
future. Do you have any ideas? Please let us know!