Use the command line to create a new directory called lab22 in your labs directory. Make sure all of the .py files that you create for this activity are in that directory.
The World Series is currently in progress, with the Kansas City Royals taking on the San Francisco Giants. If you are a baseball fan, hopefully you are a fan of one of these teams. If you are like me, however, you root for a different team. Maybe your favorite team would have done better if they had written programs to compute player statistics, like batting average. Batting average is the ratio of hits to at bats. To compute a player's batting average you have to count the different types of hits the player makes.
Details
Create a Python function called batting_average(appearances) in the file batting_average.py. The function parameter, appearances, is a list of ints that encodes the results of a baseball player's plate appearances. Each plate appearance can be interpreted as follows:
4 | Home Run |
3 | Triple |
2 | Double |
1 | Single |
0 | Walk |
-1 | Out |
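The function should return the batting average for the plate appearances in the list. Walks do not count as at bats. The batting average can be computed using the following equations (reconstructed here to agree with the sample output in the example):
$$hits = singles + doubles + triples + home\_runs\\ at\_bats = hits + outs\\ batting\_average = hits / at\_bats$$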
The function should not use the list count method.
Example
>>> print(batting_average([1, 1, -1]))
0.6666666666
>>> pedroia_at_bats = [-1, 1, -1, -1, 1, -1, -1, 2, -1, -1, -1, -1, -1, -1, -1, -1, 1, -1, 0, -1, 0, -1, -1, -1, 2, 1, -1, -1, -1, -1, 1, 0, -1, -1, -1, 1, -1, -1, 1, -1, 1, -1, 0, -1, 1, 1, -1, -1, 2, 0, -1, -1, -1, -1, -1, -1, -1, 1, -1, 2, -1, -1, -1, -1, -1, -1, -1, 1]
>>> print(batting_average(pedroia_at_bats))
0.25396825396825395
Hint
The batting_average function would probably be much easier to write if you had some mechanism to count the number of appearances of some value in the list. Towards that end, create a Python function called count(a_list, element) in the same file. The function should return the number of times element occurs in the list a_list. It would be wise to test this function before you try to use it in batting_average.
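The hint above can be sketched as follows. This is one possible approach, not the required solution. It assumes that at bats are hits plus outs (so walks are excluded), which matches the sample output in the example, and it assumes that a list with no at bats should give 0.0, a case the lab does not specify.

```python
def count(a_list, element):
    """Return how many times element occurs in a_list, without list.count."""
    total = 0
    for item in a_list:
        if item == element:
            total += 1
    return total


def batting_average(appearances):
    """Return hits divided by at bats; walks (0) are not at bats."""
    hits = (count(appearances, 1) + count(appearances, 2)
            + count(appearances, 3) + count(appearances, 4))
    at_bats = hits + count(appearances, -1)
    if at_bats == 0:
        # Assumption: the lab does not say what to return with no at bats.
        return 0.0
    return hits / at_bats


# Test the helper on its own before relying on it.
print(count([1, 1, -1], 1))
print(batting_average([1, 1, -1]))
```

Testing count separately first, as the hint suggests, makes it easier to tell whether a wrong average comes from the counting or from the formula.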
Challenge
Sabermetrics is the application of statistics to the management of baseball teams, popularized by the book Moneyball. Sabermetrics tries to improve upon the batting average metric by not just counting a batter's hits but also incorporating the batter's contribution to the number of runs scored. Create a function runs_created(appearances) that returns the Sabermetrics runs created statistic for the specified appearances list. The runs created stat can be computed with the following formulas:
$$total\_bases = (singles) + (2 \times doubles) + (3 \times triples) + (4 \times home\_runs)\\ runs\_created = ((hits + walks) \times (total\_bases)) / plate\_appearances$$
>>> ortiz = [-1, 1, 4, -1, 0, 4, 1, -1, 0, 1, 0, 1, 2, 0, 1, 2, 1, -1, 1, 0, 0, 0, -1, 0]
>>> print(runs_created(ortiz))
15.04166666666666
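A minimal sketch of the runs created formulas, for illustration only. It assumes that plate_appearances is simply the length of the list and that an empty list should give 0.0; neither is stated in the lab. The count helper is repeated here so the block runs on its own.

```python
def count(a_list, element):
    """Return how many times element occurs in a_list, without list.count."""
    total = 0
    for item in a_list:
        if item == element:
            total += 1
    return total


def runs_created(appearances):
    """Return ((hits + walks) * total_bases) / plate_appearances."""
    singles = count(appearances, 1)
    doubles = count(appearances, 2)
    triples = count(appearances, 3)
    home_runs = count(appearances, 4)
    walks = count(appearances, 0)

    hits = singles + doubles + triples + home_runs
    total_bases = singles + 2 * doubles + 3 * triples + 4 * home_runs

    # Assumption: every entry in the list is one plate appearance.
    plate_appearances = len(appearances)
    if plate_appearances == 0:
        # Assumption: the lab does not define the result for an empty list.
        return 0.0
    return (hits + walks) * total_bases / plate_appearances


ortiz = [-1, 1, 4, -1, 0, 4, 1, -1, 0, 1, 0, 1,
         2, 0, 1, 2, 1, -1, 1, 0, 0, 0, -1, 0]
print(runs_created(ortiz))
```

For the ortiz list this works out to 19 × 19 / 24: 11 hits plus 8 walks, times 19 total bases, over 24 plate appearances.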
Suppose someone comes into the class and bets you $100 that some two people in the class share a birthday. Do you share your birthday with anyone in the class? Is there a shared birthday between any two people in the class? It seems that the odds would be pretty slim. However, if you are putting up $100 on this bet, you might want to figure out your odds of winning.
Details
In a file called birthdays.py, create a function called shared_probability(group_size). This function takes one positive integer parameter, the number of people in a given group. This function should return a floating point value in the range \([0, 1]\), the probability that there is a shared birthday in a group of size group_size. The function should not compute the exact probability; instead, it should approximate the probability with repeated simulation.
What is the minimum group size for which the probability exceeds 50%? What is the minimum group size for which a shared birthday is virtually guaranteed (probability of at least 0.9999)?
>>> print(shared_probability(1))
0.0
>>> print(shared_probability(366))
1.0
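As a sanity check on simulated results, the exact probability under the uniform 365-day assumption can be computed directly. It is one minus the chance that every birthday in the group is distinct. This sketch is only a cross-check, not a substitute for the simulation-based shared_probability the lab asks for; the function name exact_shared_probability and the days parameter are hypothetical.

```python
def exact_shared_probability(group_size, days=365):
    """Return the exact chance of a shared birthday, assuming uniform dates."""
    if group_size > days:
        # More people than days forces at least one shared birthday.
        return 1.0
    probability_all_distinct = 1.0
    for already_chosen in range(group_size):
        # The next person must avoid every birthday chosen so far.
        probability_all_distinct *= (days - already_chosen) / days
    return 1.0 - probability_all_distinct


print(exact_shared_probability(1))
print(exact_shared_probability(366))
```

A simulated estimate that lands far from this value for the same group size suggests a bug in the simulation rather than random variation.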
Hint
The shared_probability function should execute a fixed number of simulations (some defined constant greater than one) for a given group size. Each simulation consists of generating the requested number of random birthdays and determining if there is a shared birthday in the group.
You can execute a simulation by creating a list of the required size and, for each entry in the list, choosing a random integer in the range \([0, 365)\). There is a shared birthday if there is a duplicate number somewhere in this list.
To estimate the probability, execute a simulation some arbitrarily large number of times (1000 should be good). Count the number of times you find a shared birthday in these simulations. Then you just need to divide this count by the number of simulations run.
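The steps in this hint can be sketched as below. This is one possible arrangement under stated assumptions, not the required solution: the constant name NUM_SIMULATIONS and the helper has_shared_birthday are hypothetical, and comparing the list's length with the length of a set built from it is just one way to detect a duplicate.

```python
import random

# Fixed number of simulations per call; 1000 is the value the hint suggests.
NUM_SIMULATIONS = 1000


def has_shared_birthday(group_size):
    """Run one simulation and report whether any two birthdays match."""
    birthdays = [random.randrange(365) for _ in range(group_size)]
    # A set drops duplicates, so a shorter set means a repeated birthday.
    return len(set(birthdays)) < len(birthdays)


def shared_probability(group_size):
    """Estimate the chance of a shared birthday by repeated simulation."""
    shared = 0
    for _ in range(NUM_SIMULATIONS):
        if has_shared_birthday(group_size):
            shared += 1
    return shared / NUM_SIMULATIONS


print(shared_probability(1))
print(shared_probability(366))
```

Because the result is an estimate, two calls with the same group size will usually differ slightly. Group sizes of 1 and 366 are the exceptions, since a single person can never match and 366 people on 365 days must.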
Challenge
The above program assumes that birthdays are uniformly distributed: for any given person, the odds of being born on a particular date are the same for all dates. However, it is probably pretty obvious to you that this is not true in real life. In fact, you are much more likely to be born in the months July through October than in any other month of the year.
How could you use the random number generator, which outputs numbers in a uniform distribution, to generate a skewed, non-uniform random distribution? Try to redo your shared_probability function using this new, non-uniform distribution of birthdays. How do the results for the above questions change? Do they get higher or lower? You may assume, for simplicity's sake, that all months have 30 days for this challenge portion.
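One way to skew a uniform generator is to scale a uniform draw by the total of a set of per-month weights and walk the running total until the draw falls inside a month's share. The sketch below illustrates that idea only. The weights are made up (July through October weighted 1.5 times the other months), and the names MONTH_WEIGHTS, skewed_birthday, and shared_probability_skewed are hypothetical.

```python
import random

# Hypothetical weights, January..December: July-October are 1.5x as likely.
MONTH_WEIGHTS = [1.0] * 6 + [1.5] * 4 + [1.0] * 2
DAYS_PER_MONTH = 30  # simplification allowed by the challenge
NUM_SIMULATIONS = 1000


def skewed_birthday():
    """Return a day number in [0, 360) drawn with the month weights above."""
    threshold = random.random() * sum(MONTH_WEIGHTS)
    running_total = 0.0
    month = len(MONTH_WEIGHTS) - 1  # fallback guards against float rounding
    for index, weight in enumerate(MONTH_WEIGHTS):
        running_total += weight
        if threshold < running_total:
            month = index
            break
    # Days within the chosen month stay uniformly distributed.
    return month * DAYS_PER_MONTH + random.randrange(DAYS_PER_MONTH)


def shared_probability_skewed(group_size):
    """Estimate the shared-birthday chance using the skewed distribution."""
    shared = 0
    for _ in range(NUM_SIMULATIONS):
        birthdays = [skewed_birthday() for _ in range(group_size)]
        if len(set(birthdays)) < len(birthdays):
            shared += 1
    return shared / NUM_SIMULATIONS


print(shared_probability_skewed(23))
```

With 30-day months the year has only 360 days, so any comparison against the uniform version should use 360 days there as well to isolate the effect of the skew.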
Submission
Please show your source code and run your programs for the instructor or lab assistant. Only programs that have perfect style and flawless functionality will be accepted as complete.