Python is an incredibly useful programming language, especially for data visualization. Here, I will show you how to (1) download and install an Integrated Development Environment (IDE), (2) use loops, if statements, and basic programming logic, (3) read in data from a CSV and manipulate it in Python, and (4) create graphs from that data.

(1) Downloading and Installing an Integrated Development Environment (IDE)

There are several IDEs to choose from; this guide uses PyCharm, one of the most useful IDEs for Python.

  • Go to https://www.jetbrains.com/pycharm/
  • Click "Download Now"
  • Choose your operating system, and download the Community option
  • Follow the instructions to install
  • Choose "Do not import settings" if given the option

Here is a screenshot of the IDE download page:

Here is a video of the setup process:


(2) An Introduction to Programming Logic

First off, create a new Python file! You can name it whatever you'd like, as long as the name ends in ".py". If you'd like some guidance, see the following video:


In the following video, I show you how to print to the console, use variables, and write "for loops". I demonstrate a few of the intricacies along the way, and I encourage you to experiment on your own!
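
If you'd like something to type along with, here is a minimal sketch of printing, variables, and a "for loop". The variable names and values are made up for illustration and are not necessarily the ones used in the video.

```python
# Print a message to the console
print("Hello, world!")

# Store values in variables (hypothetical names, chosen for illustration)
greeting = "Hello"
count = 3

# Repeat an action with a for loop; range(count) produces 0, 1, 2
for i in range(count):
    print(greeting, i)

# Build up a running total inside a loop
total = 0
for number in range(1, 5):  # range(1, 5) produces 1, 2, 3, 4 (5 is excluded)
    total = total + number
print("Total:", total)
```

Note that range stops just before its second argument, which is one of the details worth experimenting with.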


Now that you know how to use variables and loops, let's explore a really useful data structure in Python - lists. I also introduce the concept of "if statements"!
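
As a rough sketch of how lists and "if statements" fit together, the example below loops over a list and makes a decision about each item. The list name, its values, and the cutoff of 20 are assumptions made up for this example.

```python
# A list holds several values in order (hypothetical sample values)
temperatures = [12, 25, 31, 18]

# Items are accessed by position; positions start at 0
print(temperatures[0])    # the first item
print(len(temperatures))  # how many items the list holds

# Use an if statement inside a loop to treat items differently
warm_days = 0
for temp in temperatures:
    if temp > 20:
        print(temp, "is warm")
        warm_days = warm_days + 1
    else:
        print(temp, "is cool")

print("Number of warm days:", warm_days)
```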


Ok! Now that you understand lists, let's see how we can edit them:
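
Here is a short sketch of some common ways to edit a list, using Python's built-in list methods. The list contents are made up for illustration, and the video may cover a different selection of operations.

```python
# Start with a small list (hypothetical sample values)
fruits = ["apple", "banana", "cherry"]

fruits.append("date")        # add an item to the end
fruits[1] = "blueberry"      # replace the item at position 1
fruits.remove("cherry")      # remove the first item equal to "cherry"
fruits.insert(0, "apricot")  # insert an item at the beginning
last = fruits.pop()          # remove the last item and keep it in a variable

print(fruits)
print(last)
```

Each of these changes the list in place, so printing the list after every line is a good way to see what each step does.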


(3) Working with CSVs

Often, data is stored in CSV (comma-separated values) files - let's figure out how to read in data from a CSV! We'll import Python's built-in csv module (more info can be found here: https://docs.python.org/3/library/csv.html )
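
As a minimal sketch of reading a CSV with the csv module, the example below first writes a tiny sample file so that it runs on its own, then reads it back into lists. The file name "example_data.csv" and the "group" and "value" columns are assumptions made up for this example; with your own data you would skip the writing step and open your file instead.

```python
import csv

# Write a tiny sample file so the example is self-contained
# (file name and column names are hypothetical)
with open("example_data.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(["group", "value"])
    writer.writerow(["A", "1.5"])
    writer.writerow(["B", "2.5"])
    writer.writerow(["A", "3.0"])

# Read the file back in, one row at a time
groups = []
values = []
with open("example_data.csv", newline="") as f:
    reader = csv.reader(f)
    header = next(reader)  # the first row holds the column names
    for row in reader:
        groups.append(row[0])
        # csv.reader returns every cell as text, so convert numbers yourself
        values.append(float(row[1]))

print(header)
print(groups)
print(values)
```

The conversion with float matters: without it, the values stay as text and cannot be averaged or plotted as numbers.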


(4) Creating Boxplots

Here, I walk you through the basics of making and saving figures (boxplots). There are many more types of graphs you can make with the seaborn module, and I encourage you to experiment! I also show a bit of debugging and troubleshooting to introduce you to the process. Three videos break the process up into its main sections: installing seaborn (the same process can be used for any package) and creating figures.
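
For reference, here is a minimal sketch of making and saving a boxplot with seaborn. It assumes seaborn is already installed (it brings matplotlib along with it), and the data, labels, and the output file name "boxplot.png" are made up for illustration rather than taken from the videos.

```python
import matplotlib
matplotlib.use("Agg")  # optional: draw straight to a file without opening a window
import matplotlib.pyplot as plt
import seaborn as sns

# Hypothetical sample data: one group label and one number per observation
groups = ["A", "A", "A", "B", "B", "B"]
values = [1.5, 3.0, 2.2, 2.5, 4.1, 3.3]

# Draw one box per group; boxplot returns the matplotlib Axes it drew on
ax = sns.boxplot(x=groups, y=values)
ax.set_xlabel("Group")
ax.set_ylabel("Value")
ax.set_title("Values by group")

# Save the figure to an image file, then close it to free memory
plt.savefig("boxplot.png", dpi=150)
plt.close()
```

If you would rather see the figure in a window than save it, one option is to drop the matplotlib.use line and call plt.show() in place of plt.savefig.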


I hope this helps give a broad overview of data visualization in Python!