Further Reading
Have a read through our longer introductory section here in your own time, as an approachable introduction to data visualisation! Also have a look at our course bibliography here
This course uses Python to introduce some concepts that are core to producing good data visualisations that can both help you to better understand your results, and to communicate your research findings to a wider audience. As stated in the course blurb, this is an introductory course that focuses on static plots that may be the basis for discussion or for presentation as a figure in a report as opposed to interactive visualisations such as dashboards.
In addition to walking you through some step-by-step advice regarding libraries that are available, and building plot templates that let you save out your figures to high resolution PNGs or PDF files, we are also going to discuss how best to represent your data in a way that is meaningful, aesthetically pleasing, and scientifically robust.
Before introducing the libraries and modules that are available, or the actual step by step of how to create plots using Python, we are going to walk through some of the important considerations when deciding how you are going to visualise your data.
Choosing the type of plot to use to represent your data is not something obvious or straightforward, and is something that will never have a unique solution: there are lots of good options that may win out in certain situations.
Sources and inspiration
A number of fantastic articles went into this resource. Please read these works: the content shared here is a much-abridged, simplified, shortened version of the content shared in the articles below.
Bibliography
Berinato, Scott. 2016. “Visualizations That Really Work.” Harvard Business Review, June 1, 2016. https://hbr.org/2016/06/visualizations-that-really-work.
Franconeri, Steven L., Lace M. Padilla, Priti Shah, Jeffrey M. Zacks, and Jessica Hullman. 2021. “The Science of Visual Data Communication: What Works.” Psychological Science in the Public Interest: A Journal of the American Psychological Society 22 (3): 110–61.
Kelleher, Christa, and Thorsten Wagener. 2011. “Ten Guidelines for Effective Data Visualization in Scientific Publications.” Environmental Modelling & Software: With Environment Data News 26 (6): 822–27.
Midway, Stephen R. 2020. “Principles of Effective Data Visualization.” Patterns (New York, N.Y.) 1 (9): 100141.
Rougier, Nicolas P., Michael Droettboom, and Philip E. Bourne. 2014. “Ten Simple Rules for Better Figures.” PLoS Computational Biology 10 (9): e1003833.
Other resources
- Berkeley Library Data Visualisation library guide
- Tableau documentation: Visual Best Practices
- from Data to Viz project
- Principles of Data Visualization workshop notes
- What’s visual ‘encoding’ in data viz, and why is it important?
- University of Utah visualization design lab
Packages used
matplotlib
J. D. Hunter, "Matplotlib: A 2D Graphics Environment", Computing in Science & Engineering, vol. 9, no. 3, pp. 90-95, 2007.
seaborn
Waskom, M. L., (2021). seaborn: statistical data visualization. Journal of Open Source Software, 6(60), 3021, https://doi.org/10.21105/joss.03021.
Plotly
Plotly Technologies Inc. Collaborative data science. Montréal, QC, 2015. https://plot.ly.
Streamlit
Streamlit documentation, https://docs.streamlit.io/
NumPy
Harris, C.R., Millman, K.J., van der Walt, S.J. et al. Array programming with NumPy. Nature 585, 357–362 (2020). DOI: 10.1038/s41586-020-2649-2.
pandas
Data structures for statistical computing in python, McKinney, Proceedings of the 9th Python in Science Conference, Volume 445, 2010.