A Python notebook created for my CODELancashire studies.
Named 'the best presentation by a student ever' by my tutor.
Tools
Python, NumPy, Pandas, Matplotlib, Jupyter
The mission
Take any dataset we wanted, use the Python we'd learned to manipulate the data, then present the results to the class.
What I did
- Found a database of baseball statistics ranging from the 1800s to the modern day. (Source: Kaggle)
- Cleaned up the data, fixing columns that had mixed data types and removing records that I didn't feel contributed to the project, which made the notebook faster and improved the quality of the analysis.
- Used sorting, filtering and aggregating to find the most effective players in various categories across single seasons, eras and full careers.
- Graphed the data to identify how trends have changed over time.
- Invented a custom formula to identify my number one player of all time.
- Added headings, commentary, photos and GIFs to improve clarity, explain my thoughts and make the presentation more viewer-friendly.
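
For a rough idea of what the cleaning and aggregating steps above can look like in Pandas, here is a minimal sketch. It is not the code from the notebook: the column names (`player`, `year`, `games`, `home_runs`), the `min_games` cut-off and the handful of rows are all made up for illustration, and the real Kaggle data has a different shape.

```python
import pandas as pd

# Toy rows invented for illustration only (not from the Kaggle dataset).
# The column names are assumptions, not the real schema.
raw = pd.DataFrame({
    "player": ["player_a", "player_a", "player_b", "player_b", "player_c"],
    "year": [1998, 1999, 1998, 1999, 1999],
    "games": [150, 140, 155, 20, 160],
    "home_runs": ["40", "35", 28, "n/a", 31],  # strings and ints mixed together
})


def clean(df, min_games=50):
    """Fix the mixed-type column and drop records that add little.

    min_games is a hypothetical threshold chosen for this example.
    """
    # Convert everything to numbers; unparseable values become NaN.
    out = df.assign(home_runs=pd.to_numeric(df["home_runs"], errors="coerce"))
    # Drop rows where the statistic could not be parsed.
    out = out.dropna(subset=["home_runs"])
    # Drop seasons with too few games to be meaningful.
    out = out[out["games"] >= min_games]
    return out.astype({"home_runs": int}).reset_index(drop=True)


def career_totals(df):
    """Sum home runs per player, highest total first."""
    return df.groupby("player")["home_runs"].sum().sort_values(ascending=False)


seasons = clean(raw)

# Best single season: sort by the statistic and take the top row.
best_season = seasons.sort_values("home_runs", ascending=False).iloc[0]

# Best career: aggregate across all of a player's seasons.
totals = career_totals(seasons)

print(best_season["player"], best_season["year"], best_season["home_runs"])
print(totals)
```

Using `pd.to_numeric(..., errors="coerce")` turns bad values into `NaN` so they can be dropped in one go, instead of crashing on the first string that isn't a number.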
The outcome
Not only did I get a 10/10 score from my tutor Andre, but he said: "That was the best presentation by a student, like... ever."
(and also "It was almost good enough to make baseball seem interesting.")
