Overview
This example loosely follows chapter 4 of Marchi.
Data
The data for this example comes from the Teams table in the Lahman database.
Mapping Data
Following Marchi, we restrict the data to seasons after 2000 and create two additional fields.
- wpct - the win percentage, which is the number of wins divided by the total number of games (wins plus losses)
- rd - the run differential, which is defined as the number of runs scored minus the number of runs allowed
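// Read the dataset stored under the name 'teams'.
let data = $val('teams');
// Keep seasons after 2000 and add the two derived fields.
let data2 = data.filter(p => p.yearID > 2000).map(p => {
  let record = {
    ...p,
    wpct: p.W / (p.W + p.L), // win percentage
    rd: p.R - p.RA           // run differential
  };
  return record;
});
// Store the transformed dataset under the name 'data2'.
$val.set('data2', data2);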
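To sanity-check the mapping outside the notebook, the same filter and map can be run on a couple of hand-made rows. This is only a sketch: `sampleTeams` and `addDerivedFields` are hypothetical names, and the numbers are made up for illustration rather than taken from the Lahman database.

```javascript
// Hypothetical rows with only the fields the mapping needs.
// The values are invented for illustration, not real Lahman data.
const sampleTeams = [
  { yearID: 1999, W: 90, L: 72, R: 800, RA: 750 }, // dropped by the year filter
  { yearID: 2001, W: 95, L: 67, R: 800, RA: 700 }  // kept
];

// Same transformation as the notebook cell, wrapped in a plain function
// so it does not depend on $val.
function addDerivedFields(rows) {
  return rows
    .filter(p => p.yearID > 2000)
    .map(p => ({
      ...p,
      wpct: p.W / (p.W + p.L), // wins divided by wins plus losses
      rd: p.R - p.RA           // runs scored minus runs allowed
    }));
}

const derived = addDerivedFields(sampleTeams);
console.log(derived);
```

Only the 2001 row should survive the filter, with rd equal to 100 and wpct equal to 95/162.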
OLS Regression
To tease out the relationships between the variables in the data, use the OLS Regression app to run linear regressions against the data.
To view the regression of wpct against RA, select the transformed dataset (data2) and set the y variable to wpct. Next, select RA in the x value combo box and click the add X button. This adds RA as an independent factor. Finally, click the regress button.
To explore the relationships with the other variables, click the regress next button, which changes the x value to the next field in the dataset. You can scroll through the X values and peruse the regression statistics. The most important statistic to watch initially is probably R-squared, which indicates how much of the variation in win percentage is explained by the given X factor.
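For intuition about what R-squared measures, here is a minimal sketch of a single-variable least-squares fit. It is not the OLS Regression app's implementation; `simpleOls` is a hypothetical helper written for illustration. With one X factor, R-squared equals the squared correlation between x and y.

```javascript
// Minimal single-variable OLS sketch (hypothetical helper, not the app's code).
// Fits y = intercept + slope * x and reports R-squared.
function simpleOls(xs, ys) {
  const n = xs.length;
  const meanX = xs.reduce((a, b) => a + b, 0) / n;
  const meanY = ys.reduce((a, b) => a + b, 0) / n;

  let sxx = 0; // sum of squared deviations of x
  let sxy = 0; // sum of cross products of deviations
  let syy = 0; // sum of squared deviations of y
  for (let i = 0; i < n; i++) {
    const dx = xs[i] - meanX;
    const dy = ys[i] - meanY;
    sxx += dx * dx;
    sxy += dx * dy;
    syy += dy * dy;
  }

  const slope = sxy / sxx;
  const intercept = meanY - slope * meanX;
  // For one regressor, R-squared is the squared correlation coefficient.
  const rSquared = (sxy * sxy) / (sxx * syy);
  return { slope, intercept, rSquared };
}

// Toy data lying exactly on the line y = 1 + 2x.
const fit = simpleOls([1, 2, 3, 4], [3, 5, 7, 9]);
console.log(fit);
```

In the notebook, a call such as `simpleOls(data2.map(p => p.rd), data2.map(p => p.wpct))` would give a comparable figure for the rd regression, assuming `data2` is available in scope.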
Charting
The charts in the OLS Regression app give a visual display of the relationships. The primary way to visualize the relationship is to choose rd in the chart field drop-down and then click the chart button. To use another factor as the X value, choose a different field in the drop-down and click chart again.