Last week, I was almost done with implementing the functions to calculate the averaged standard scores and percentile scores. This was my second attempt on this, and I tried to make use to std::unordered_map to improve the time complexity of the calculation. Just when I have completed it, Dr Shawn took a look at my code and spotted an issue that I have yet to consider before. In my implementation of percentile scores calculation, I heavily relied on making complex queries involving join, groupby, as well as aggregate functions such as avg(). Therefore, this result in cases where I had to execute the query by directly providing an SQL string to the session.query() function in Wt::Dbo. Dr Shawn told me that this sort of implementation is highly vulnerable to SQL injection attacks. For example, suppose the SQL string provided has the form

session.query("SELECT id, age FROM players, absolutescores WHERE name = " + std::to_string(name) + " and ... ")

Then, all it takes is a corrupted name (eg. name = “; DROP table players; “) to wipe out all records in the players table. Dr Shawn also mentioned that the benefit of relying on the bind() function in Wt::Dbo to perform queries is that Wt::Dbo will help to sanitize the data before executing the query, and this provides a layer of protection against corrupted field values. Right after learning about this, I spent about a day re-implementing the percentile calculation function, making sure that I do not rely on any SQL strings to perform queries directly.

On Wednesday, Dr Shawn told me that, before proceeding to implementing more functions for calculation of other statistics, it is better for me to first implement the REST API for returning JSON object, as well as the front end codes for displaying visualizations on the browser. I think this might be a good idea too, because the sooner I can understand how the REST API and front end works together, the less uncertainty I am going to face when I structure my code in the next few weeks.

To be able to test the REST API that I was going to implement, I need to first be able to run CAS as a witty application on localhost. By referring to codes available in CRS and CMS, I added a few files to CAS:

  • cas.cpp which contains the main() function for initializing Wt::WServer and Wt::WResource objects
  • run.sh which is a shell script for calling the Makefile,
  • cas another shell script for running the binary file and monitoring the status of the server from terminal.
  • wt_config.xml which specifies the resources available and their corresponding API endpoints

With these, I was then able to run CAS as a witty application on localhost. I then proceed to implementing the REST API for returning JSON objects to the client. At first, I looked at the REST API implementation in CMS and noticed that the Poco::Json::Object was being used to manually set the key-value pairs. It seemed straightforward enough. All I need to do is, for every table in CAS database, implement a derived class of Wt::WResource that would handle querying from that specific table, except that all of them would look very similar to each other. This led me to consider writing a TableResource<T> template class instead. This is possible, because all of my querying functions in CASDatabase were implemented as template functions.

The tricky part, is the part where I have to set the key-value pairs of JSON object. I came up with 2 options for this,

  • Implement a returnPocoJsonObject() virtual function in every derived class of Table to handle setting the key-value pairs. Then, call this function from the TableResource<T> template class
  • The second option, is to use the Wt::Dbo::JsonSerializer in the Wt library to directly take a Wt::Dbo::ptr and serialize it as a JSON to be written into the response.out() stream

The second option seemed really tempting because it takes the least amount of code. I decided to test it out. However, I soon found out some limitations withWt::Dbo::JsonSerializer.

  • Wt’s serializer is only able to serialize field of with int type or std::string, even double is not supported! If I use this I’ll need to store doubles as integers in the CAS database. Although decimal point precision is not too critical for the analysis that I’m planning to use, I still think this should be avoided because it imposes a strong restriction on the database.
  • Wt::Dbo::JsonSerializer serializes every attribute of a Wt::Dbo::ptr object. When serializing an object that is involved in a One to Many relationship with another table, the ‘Many’ side which is a collection of objects will be serialized as an Array of Json objects as well. This is not ideal because it’ll result in the REST API returning an unnecessarily large JSON.

So, I’ve decided to proceed with the first option, which is by relying on Poco::Json::Object instead. I hope I can get this done soon and get started with the front end development.


0 Comments

Leave a Reply

This site uses Akismet to reduce spam. Learn how your comment data is processed.