This week, I continued working on the visualizations for displaying analysis results using the plotly.js library. I had previously implemented bar charts to illustrate the percentile score of each performance record. The bar chart looks fine, but I was not very satisfied with it because I feel that a user with little background in statistics might not find it interesting at all. I decided to explore the plotly.js library a little further to see if there were better options, and I thought of a more intuitive way to present percentile scores: histograms. I spent a few hours coding up the JavaScript required for this, and the result looks much more appealing than plain bar charts.
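The idea can be sketched roughly as follows: draw the full score distribution as a histogram and mark where the performance in question falls. This is only a minimal illustration of the approach, not the repository code; the data and variable names here are made up for the example.

```javascript
// Hypothetical example data: all scores in the pool, plus the score of
// the performance record being viewed.
const allScores = [62, 68, 71, 74, 78, 80, 81, 83, 85, 88, 92];
const thisPerformance = 85;

// A plotly.js histogram trace over the whole distribution.
const trace = {
  x: allScores,
  type: 'histogram',
  nbinsx: 10,
  marker: { color: 'steelblue' },
};

// A dashed vertical line marks where this performance sits, which reads
// more intuitively as a percentile than a bare bar chart does.
const layout = {
  title: 'Score distribution',
  shapes: [{
    type: 'line',
    x0: thisPerformance, x1: thisPerformance,
    y0: 0, y1: 1, yref: 'paper',
    line: { color: 'crimson', width: 2, dash: 'dash' },
  }],
};

// In the browser: Plotly.newPlot('chart', [trace], layout);
```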

However, I also quickly realized a downside of this approach. By using a histogram, I am revealing a lot more information (the average judge scores) to the user. Dr Shawn mentioned to me before that it is best not to reveal judge-score information to the user. For now, I am leaving both implementations (the bar chart approach and the histogram approach) available in the repository.

I also used the same histogram API to plot the distributions of average scores by prize, by venue, by event, and by instrument. These histograms are not targeted at a specific audience; I thought it would be good to add them to the system to help people better understand the nature of the competition.
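Since each breakdown is just the same histogram over a different slice of the records, one small helper covers all of them. This is a hypothetical sketch; the record fields (`avgScore`, `prize`, `venue`, `instrument`) are assumed names, not the actual schema.

```javascript
// Hypothetical records; the real system's data shape may differ.
const records = [
  { avgScore: 82, prize: 'gold',   venue: 'Hall A', instrument: 'piano'  },
  { avgScore: 74, prize: 'silver', venue: 'Hall B', instrument: 'violin' },
  { avgScore: 69, prize: 'bronze', venue: 'Hall A', instrument: 'piano'  },
];

// Pull the average scores for one value of one grouping field.
function scoresBy(field, value) {
  return records.filter(r => r[field] === value).map(r => r.avgScore);
}

// One plotly.js histogram trace per group, reused across all breakdowns.
function histogramTrace(values, name) {
  return { x: values, type: 'histogram', name };
}

const goldTrace = histogramTrace(scoresBy('prize', 'gold'), 'gold');
// In the browser, one call per breakdown:
// Plotly.newPlot('byPrize', [goldTrace /* , silverTrace, ... */], {});
```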

Another visualization that I added this week is a set of grouped bar charts for visualizing the precision and recall scores of the judges from each event. The result looks fine (visually), but I was not very satisfied because I could not find many instances where two judges from the same event could be differentiated by comparing their precision and recall scores. In many cases, judge A has higher precision than judge B, but judge B has better recall than judge A. This made me realize a weakness in the precision and recall metrics, so after some thought I decided to add a new metric: the widely used root mean squared error (RMSE). I thought this would be useful for differentiating judges because it penalizes a judge according to how far off their prediction is. For instance, a bronze-prediction (giving a score in the bronze ‘range’) for a gold-winning performance should be penalized much more than a silver-prediction for the same performance. This is something that precision and recall cannot capture.
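A small worked example makes the point about RMSE concrete. The numeric encoding of the prize ranges below (gold = 3, silver = 2, bronze = 1) is hypothetical and chosen only to illustrate the penalty difference.

```javascript
// Hypothetical numeric encoding of the prize 'ranges'.
const PRIZE_SCORE = { gold: 3, silver: 2, bronze: 1 };

// Root mean squared error between a judge's predictions and the outcomes.
function rmse(predicted, actual) {
  const sumSq = predicted.reduce(
    (acc, p, i) => acc + (p - actual[i]) ** 2, 0);
  return Math.sqrt(sumSq / predicted.length);
}

// Both judges misjudge the same gold-winning performance, but judge A's
// bronze-level prediction is further off than judge B's silver-level one,
// so judge A's error is larger: RMSE distinguishes them where a simple
// right/wrong classification (precision/recall) would not.
const actual = [PRIZE_SCORE.gold];
const judgeA = [PRIZE_SCORE.bronze]; // RMSE = 2
const judgeB = [PRIZE_SCORE.silver]; // RMSE = 1
```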

On Friday, I started working on my GitHub wiki documentation. I am feeling a bit nervous right now because there is a lot on my mind that I would like to cover in the wiki, and I am not sure I will be able to finish writing it all. I hope to finish it this upcoming week.

Categories: Experiential