Week 5: Starting testing

October 10, 2017

Good morning, blog followers. I apologize for not writing this last night as usual. I was at Grace Hopper this past week and did not have much time to work on the project. I will be off next week because my term is coming to an end. I plan to keep working on the project, but I may not post next Monday and will instead write a longer report the following week. Briefly, here's what I discussed this week with the graduate student in charge of the project.

Task

The goal this week is to start developing and testing one of the tools. Between the build-a-rank tool and the explain-a-rank tool, the explain tool makes more sense to me to start with. My plan is to learn about the regression version of the random forest classifier (random forest regression, I believe) and modify it to fit our needs. Here is my strategy for developing the tool.

Dummy Test Cases

One will be really simple, with only one attribute. For example, the set Fruit, A1: apple, 2; banana, 4; pear, 5. If I gave the tool the ranking "apple, banana, pear", it should return "A1, ascending" as the attribute that explains the rank.

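Another will be a bit more complicated by having a few attributes (e.g. Fruit, A1, A2: apple, 2, 10; banana, 4, 9; pear, 5, 2). If I gave the tool the ranking "pear, banana, apple", it should return "A2, ascending", since A2 runs 2, 9, 10 along that ranking. By the same convention "A1, descending" fits this ranking equally well, so a stricter test case will need attributes that disagree with each other.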
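To make the two dummy cases concrete, here is a minimal sketch of how they could be written down and checked. It is not the tool itself: `explain_rank` is a hypothetical helper name, and it only tests whether an attribute is strictly increasing or decreasing from first place to last, which is the simplest possible notion of "explains the rank".

```python
# Hypothetical helper; the real explain-a-rank tool does not exist yet.

def explain_rank(rows, ranking):
    """Return (attribute, direction) pairs whose values are strictly
    monotone along the given ranking.

    rows    -- dict mapping item name to a dict of attribute values
    ranking -- list of item names, first place first
    """
    attributes = next(iter(rows.values())).keys()
    explanations = []
    for attr in attributes:
        values = [rows[item][attr] for item in ranking]
        pairs = list(zip(values, values[1:]))
        if all(a < b for a, b in pairs):
            explanations.append((attr, "ascending"))
        elif all(a > b for a, b in pairs):
            explanations.append((attr, "descending"))
    return explanations


# First dummy case: a single attribute.
one_attr = {"apple": {"A1": 2}, "banana": {"A1": 4}, "pear": {"A1": 5}}
print(explain_rank(one_attr, ["apple", "banana", "pear"]))

# Second dummy case: two attributes.
two_attr = {
    "apple": {"A1": 2, "A2": 10},
    "banana": {"A1": 4, "A2": 9},
    "pear": {"A1": 5, "A2": 2},
}
print(explain_rank(two_attr, ["pear", "banana", "apple"]))
```

Under this convention (ascending means the attribute grows from first place to last), the second case reports both ("A1", "descending") and ("A2", "ascending"), so the expected answer depends on how the direction is defined and on how ties between equally good attributes are broken.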

Assumptions

For now, I will assume all attributes (except the one being predicted) are continuous. I will also assume adjacent ranks are equally spaced (2 - 1 == 3 - 2 == 4 - 3).
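As a small illustration of these two assumptions, the sketch below turns a ranking into numeric targets that are one apart and checks that every attribute value is a plain number. The function names are made up for this example.

```python
def ranking_to_targets(ranking):
    """Map each item to its 1-based place, so neighbouring places differ by 1."""
    return {item: place for place, item in enumerate(ranking, start=1)}


def all_continuous(rows):
    """Check that every attribute value is a number (and not a bool or string)."""
    return all(
        isinstance(value, (int, float)) and not isinstance(value, bool)
        for attrs in rows.values()
        for value in attrs.values()
    )


targets = ranking_to_targets(["apple", "banana", "pear"])
places = list(targets.values())
gaps = [b - a for a, b in zip(places, places[1:])]

fruit = {"apple": {"A1": 2}, "banana": {"A1": 4}, "pear": {"A1": 5}}
print(targets, gaps, all_continuous(fruit))
```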

Goal

Using one year's worth of college facts from CollegeScorecard, I'll create a couple of different rankings by artificially weighting attributes. My algorithm should then be able to recover those weights and the attributes they were applied to.
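Here is a rough sketch of that experiment on made-up data. Everything in it is an assumption for illustration: random numbers stand in for the CollegeScorecard attributes, the weights are invented, and plain least squares stands in for the random forest regression I plan to use. It shows the idea of building a ranking from known weights and checking that the fitted model picks out the right attributes.

```python
import random


def solve_linear_system(matrix, vector):
    """Solve matrix @ x = vector by Gaussian elimination with partial pivoting."""
    n = len(vector)
    aug = [row[:] + [vector[i]] for i, row in enumerate(matrix)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    solution = [0.0] * n
    for r in range(n - 1, -1, -1):
        known = sum(aug[r][c] * solution[c] for c in range(r + 1, n))
        solution[r] = (aug[r][n] - known) / aug[r][r]
    return solution


def fit_least_squares(features, targets):
    """Fit targets ~ intercept + features; return the feature coefficients."""
    design = [[1.0] + list(row) for row in features]
    k = len(design[0])
    xtx = [[sum(r[i] * r[j] for r in design) for j in range(k)] for i in range(k)]
    xty = [sum(r[i] * t for r, t in zip(design, targets)) for i in range(k)]
    return solve_linear_system(xtx, xty)[1:]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
true_weights = [0.7, 0.3, 0.0]  # invented weights; the third attribute is ignored
colleges = [[rng.random() for _ in true_weights] for _ in range(200)]

# Artificial ranking: sort by weighted score, highest score in first place.
scores = [sum(w * x for w, x in zip(true_weights, row)) for row in colleges]
order = sorted(range(len(colleges)), key=lambda i: scores[i], reverse=True)

# Equal-distance assumption: place 1, 2, 3, ... is the regression target.
position = [0] * len(colleges)
for place, index in enumerate(order, start=1):
    position[index] = place

coefficients = fit_least_squares(colleges, position)
# Place 1 is best, so a heavily weighted attribute gets a large negative slope.
importance = [abs(c) for c in coefficients]
print(importance)
```

Because the target is the place in the ranking rather than the score itself, the fit recovers the relative importance of the attributes and not the exact weights. The attribute with weight 0 should come out with a slope close to zero.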

See you in two weeks, hopefully with some solid results!

MaryAnn's Blog