Play by Play Football Data

Overview


This example calculates the average number of yards gained per play for the 2000 season. The 'yards_gained' field holds the number of yards gained on a play. For plays where this metric does not apply, the field is empty and is loaded as the empty string "".

This example uses array functions and the $list API to transform the data as necessary.

Example Code and Data Cleaning


This example assumes that the play by play file has been loaded into a workspace with the name "play_by_play_2000".

  • First, the script retrieves the data from the workspace using $val.
  • Next, it uses the $list API to select the yards_gained field and filters out the records where this field is the empty string "".
  • It then calls the average function on the resulting list.
  • Finally, it prints the result.

let data = $val('play_by_play_2000');

let avYard = $list(data).map(p=>p.yards_gained).filter(p=>p !== '').average();
$console.log(avYard);
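
For readers who want to see what the pipeline above does step by step, here is a minimal sketch of the same map, filter, and average sequence using plain JavaScript arrays. The sample records and the helper name averageYards are hypothetical and are not part of the $list API. The sketch also assumes that values may arrive as strings, so it converts them with Number. Note that the strict comparison `!== ''` matters: in JavaScript `0 == ''` is true, so a loose comparison would also drop genuine zero-yard plays.

```javascript
// Hypothetical sample records; the real file has many more rows and fields.
const samplePlays = [
  { play_type: 'pass', yards_gained: 12 },
  { play_type: 'run', yards_gained: 3 },
  { play_type: 'kickoff', yards_gained: '' }, // metric not relevant: empty string
  { play_type: 'pass', yards_gained: 0 },     // a real zero-yard play, must be kept
];

// Plain-array equivalent of map -> filter -> average.
function averageYards(records) {
  const yards = records
    .map(p => p.yards_gained)
    .filter(y => y !== '') // strict comparison keeps 0 and drops only ''
    .map(Number);          // assumption: values may be loaded as strings
  if (yards.length === 0) return NaN; // nothing to average
  const total = yards.reduce((sum, y) => sum + y, 0);
  return total / yards.length;
}

console.log(averageYards(samplePlays)); // 5
```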
					

Filtering by Play Type


Once you have written the code to calculate the average number of yards gained, you can filter the data to calculate the average yards gained by play type. Here, we re-run the calculation for pass plays and for run plays.

let data = $val('play_by_play_2000');

let avPassYard = $list(data).filter(p=>p.play_type === 'pass').map(p=>p.yards_gained).filter(p=>p !== '').average();
let avRunYard = $list(data).filter(p=>p.play_type === 'run').map(p=>p.yards_gained).filter(p=>p !== '').average();

$console.log(avPassYard);
$console.log(avRunYard);
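
The script above repeats the same pipeline once per play type. As an alternative, the following sketch computes the average for every play type in a single pass over the records. It uses plain JavaScript rather than the $list API, and the sample records and the helper name averageYardsByType are hypothetical.

```javascript
// Hypothetical sample records standing in for the loaded workspace data.
const typedPlays = [
  { play_type: 'pass', yards_gained: 10 },
  { play_type: 'pass', yards_gained: 20 },
  { play_type: 'run', yards_gained: 4 },
  { play_type: 'run', yards_gained: 2 },
  { play_type: 'punt', yards_gained: '' }, // skipped: no yards recorded
];

// Accumulate a running sum and count per play type, then divide.
function averageYardsByType(records) {
  const totals = new Map(); // play_type -> { sum, count }
  for (const p of records) {
    if (p.yards_gained === '') continue; // same filter as in the script above
    const entry = totals.get(p.play_type) || { sum: 0, count: 0 };
    entry.sum += Number(p.yards_gained); // assumption: values may be strings
    entry.count += 1;
    totals.set(p.play_type, entry);
  }
  const averages = {};
  for (const [type, { sum, count }] of totals) {
    averages[type] = sum / count;
  }
  return averages;
}

console.log(averageYardsByType(typedPlays)); // { pass: 15, run: 3 }
```

Play types whose records all have an empty yards_gained value, such as the punt above, do not appear in the result.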
					

Max Values


Instead of the average, you can obtain the maximum value by calling max() in place of average().

let data = $val('play_by_play_2000');

let maxPassYard = $list(data).filter(p=>p.play_type === 'pass').map(p=>p.yards_gained).filter(p=>p !== '').max();
let maxRunYard = $list(data).filter(p=>p.play_type === 'run').map(p=>p.yards_gained).filter(p=>p !== '').max();

$console.log(maxPassYard);
$console.log(maxRunYard);
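
To illustrate what the maximum calculation involves, here is a sketch in plain JavaScript. It is not the implementation of the $list max function; the sample records and the helper name maxYards are hypothetical. The sketch converts values with Number before comparing them, on the assumption that they may be loaded as strings: compared as strings, '9' would rank above '45'. It uses reduce rather than Math.max(...values) because spreading a very large array into a function call can exceed the engine's argument limit.

```javascript
// Hypothetical sample records; yards_gained can be negative on a loss.
const maxSample = [
  { play_type: 'pass', yards_gained: 45 },
  { play_type: 'pass', yards_gained: -7 },
  { play_type: 'run', yards_gained: 12 },
  { play_type: 'run', yards_gained: 80 },
  { play_type: 'punt', yards_gained: '' },
];

// Largest yards_gained for one play type; -Infinity if no play qualifies.
function maxYards(records, playType) {
  return records
    .filter(p => p.play_type === playType && p.yards_gained !== '')
    .map(p => Number(p.yards_gained)) // compare numbers, not strings
    .reduce((best, y) => (y > best ? y : best), -Infinity);
}

console.log(maxYards(maxSample, 'pass')); // 45
console.log(maxYards(maxSample, 'run'));  // 80
```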
					

Histogram


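To visualize the distribution of yards gained per play, it is instructive to display the yards in a histogram. The script below keeps only the yards_gained and play_type fields, restricts the records to pass plays with a non-empty yards_gained value, and stores the result in the workspace under the name "data2".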

let data = $val('play_by_play_2000');

let data2 = data
            .map(p=>({
                yards_gained:p.yards_gained,
                play_type:p.play_type
            }))
            .filter(p=>p.play_type === 'pass')
            .filter(p=>p.yards_gained !== '');
            
            
$val.set('data2', data2);
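
How the histogram itself is drawn from "data2" is not shown here. As a rough illustration of the binning step that any histogram performs, the following plain JavaScript sketch counts values per fixed-width bin. The helper name histogramBins, the bin width of 5 yards, and the sample records are all assumptions made for the example.

```javascript
// Hypothetical records shaped like the entries stored in "data2".
const passPlays = [
  { yards_gained: -3, play_type: 'pass' },
  { yards_gained: 0, play_type: 'pass' },
  { yards_gained: 4, play_type: 'pass' },
  { yards_gained: 5, play_type: 'pass' },
  { yards_gained: 9, play_type: 'pass' },
  { yards_gained: 12, play_type: 'pass' },
  { yards_gained: 27, play_type: 'pass' },
];

// Count values per bin; each bin covers [lower, lower + binWidth).
// Returns [lower, count] pairs sorted by lower edge.
function histogramBins(values, binWidth) {
  const counts = new Map(); // bin lower edge -> count
  for (const v of values) {
    // Math.floor also places negative values in the correct bin.
    const lower = Math.floor(v / binWidth) * binWidth;
    counts.set(lower, (counts.get(lower) || 0) + 1);
  }
  return [...counts.entries()].sort((a, b) => a[0] - b[0]);
}

const bins = histogramBins(passPlays.map(p => Number(p.yards_gained)), 5);
console.log(JSON.stringify(bins)); // [[-5,1],[0,2],[5,2],[10,1],[25,1]]
```

Bins with no plays, such as 15 to 20 yards in this sample, are simply absent from the result.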