CPSC120B
Fundamentals of Computer Science I
Day 26 Notes
Histograms
Cool Computer Science Thing of the Day
No Reading Questions
No Quiz
Histograms
- A histogram is a way of visualizing the distribution of data
- A histogram is a bar chart where the size of each bar corresponds to the number of values in a range
- For example:
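A minimal sketch in Python of how such a chart could be built from a list of numbers. The scores, the range width of 10, and the function names `bin_counts` and `text_histogram` are made up for illustration; they are not from the course materials.

```python
def bin_counts(values, bin_width):
    """Count how many values fall into each range of size bin_width."""
    counts = {}
    for v in values:
        # Lower edge of the range that contains v (e.g. 74 -> 70 when bin_width is 10)
        start = (v // bin_width) * bin_width
        counts[start] = counts.get(start, 0) + 1
    return counts


def text_histogram(values, bin_width):
    """Return the histogram as text rows, one row of stars per non-empty range."""
    counts = bin_counts(values, bin_width)
    rows = []
    for start in sorted(counts):
        end = start + bin_width - 1
        rows.append(f"{start:3d}-{end:3d} | " + "*" * counts[start])
    return rows


scores = [55, 62, 67, 71, 74, 78, 79, 83, 85, 88, 91, 95]  # made-up data
for row in text_histogram(scores, 10):
    print(row)
```

Each row is one bar: the longer the row of stars, the more values fell in that range. With this data the 70-79 row is the longest, with four stars.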
Code Breaking
- In addition to visualizing distribution, a histogram can be used to break a substitution cipher
- Recall the Caesar cipher you coded a couple of weeks ago, where characters are shifted some amount in the alphabet
- Breaking a message encoded this way is simple: just try every possible shift until you can read the message
- Recall the substitution cipher, where each letter is mapped to a different letter
- Breaking this by brute force could require up to 26!, or 403291461126605635584000000, attempts
- If the message is long enough, we can take advantage of the language it is written in to break it more quickly
- In English, the most frequently occurring letter is ‘e’
- As an aside, this is why there are a lot of E tiles that aren’t worth very much in Scrabble.
- So, we find the most frequently occurring letter in the encoded message and replace it with ‘e’
- The second most frequently occurring letter in English is ‘t’, so replace the second most frequent letter in the message with that
- Repeat until the message starts to make sense
- Counting the number of occurrences of each letter produces a histogram
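The steps above can be sketched in Python. This is an illustration under stated assumptions, not the lab solution: the function names are hypothetical, and `ENGLISH_ORDER` is one commonly quoted approximate ranking of English letters (sources agree on ‘e’ and ‘t’ at the top but differ further down).

```python
from collections import Counter

# Approximate English letter order, most to least frequent.
# Assumption: rankings vary by source past the first few letters.
ENGLISH_ORDER = "etaoinshrdlcumwfgypbvkjxqz"


def letter_histogram(text):
    """Count occurrences of each letter a-z, ignoring case and other characters."""
    return Counter(ch for ch in text.lower() if "a" <= ch <= "z")


def first_guess_key(ciphertext):
    """Map the most common cipher letter to 'e', the next to 't', and so on."""
    counts = letter_histogram(ciphertext)
    # most_common() lists letters from highest count to lowest
    ranked = [letter for letter, _ in counts.most_common()]
    return dict(zip(ranked, ENGLISH_ORDER))


def apply_key(ciphertext, key):
    """Replace each cipher letter using key; leave spaces and punctuation alone."""
    return "".join(key.get(ch, ch) for ch in ciphertext.lower())


print(letter_histogram("Hello, World!").most_common(2))
```

The first guess is rarely right for every letter, especially on short messages, which is why the last step is to keep swapping letters until the message starts to make sense.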
Lab