# First Steps #3: A Primer on Plots

This is post #3 in our First Steps series. Want to learn Julia but don't know where to start? Start here!

Visualizing data is an essential skill for a data scientist.  Unlike R, Julia does not ship with plotting functionality built-in.  If you search for ways to make plots in Julia, you'll discover a lot of options.  So what should you use?

## 📊 Plots.jl

We recommend the Plots package (especially for beginners).

Plots is a unified interface for creating visualizations with different backends (such as GR, Plotly.js, and UnicodePlots).  It works well for beginners and power users alike, and it's designed so that a lot of the things you try will "just work".

### 💻 Install Plots

In the Julia REPL, add the Plots package if you haven't already done so.  Recall that you enter Pkg Mode by pressing `]`:

``(@v1.6) pkg> add Plots``

### 📈 Create Your First Plot

Return to Julia mode by pressing backspace (labeled delete on Mac keyboards), then enter:

```julia
julia> using Plots

julia> plot(randn(10), title="My First Plot")
```

🎉 Congrats!  You made your first plot 📈!  You created it using:

1. `randn(10)`: A `Vector` of 10 random samples from a Normal(0,1) distribution.
2. The GR backend (Plots' default).

## ✨ Core Principles

The main function you'll use, as you may have guessed, is

``plot(args...; kw...)``

Here `args...` means any number of positional arguments and `kw...` is any number of keyword arguments.  Look back at the first plot we created and notice that we provided data `randn(10)` as a positional argument and the title `title="My First Plot"` as a keyword argument.  Another function you'll use is

``plot!(args...; kw...)``

In Julia, a trailing `!` is a naming convention for functions that mutate at least one of their arguments.  In Plots, `plot!` changes or adds to an existing plot (the most recent one, unless you pass a plot explicitly) instead of creating a new one.
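To see how Julia separates the two kinds of arguments, here is a minimal sketch in plain Julia.  The function `describe_call` is hypothetical (it is not part of Plots); it only reports what it received, mirroring the shape of the `plot(args...; kw...)` signature.

```julia
# Hypothetical helper (not part of Plots): reports how a call is split
# into positional arguments and keyword arguments.
function describe_call(args...; kw...)
    return (npositional = length(args), keywords = collect(keys(kw)))
end

# Same shape as plot(randn(10), title="My First Plot"):
info = describe_call(randn(10), title = "My First Plot")

println(info.npositional)  # number of positional arguments
println(info.keywords)     # names of the keyword arguments
```

Everything before the `;` in the signature is collected into `args`, and every `name = value` pair ends up in `kw`.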

Now that we know the functions we are using, let's look at the core principles:

### Principle #1: Every Thing You Plot is a Series

When you give data to the `plot` function (like `randn(10)` above), the `seriestype` determines how Plots will interpret the data.  By default this is `:path`.

```julia
plot(1:10, seriestype = :path, label = "Series 1")

plot!(rand(1:10,10), seriestype = :scatter, label = "Series 2")
```

### Principle #2: Plot Attributes have Aliases

Plot attributes are passed by keyword arguments.   Because of aliases, you can often guess at the name of an attribute and Plots will interpret it correctly.  For example, the following commands are equivalent:

```julia
plot(randn(10), seriestype = :scatter)

plot(randn(10), st = :scatter)

scatter(randn(10))
```
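Aliases exist for other attributes too.  The sketch below assumes two commonly used short forms, `lw` for `linewidth` and `ms` for `markersize`.  It builds the same plot with long and short names, then looks up a stored attribute to compare them.  The `p[1][1][:linewidth]` lookup (first subplot, first series) is only used here for illustration.

```julia
using Plots

y = randn(10)

# Full attribute names
p_long = plot(y, seriestype = :path, linewidth = 3, markersize = 6)

# Aliases for the same attributes
p_short = plot(y, st = :path, lw = 3, ms = 6)

# Both plots should store the same values under the full attribute names
println(p_long[1][1][:linewidth] == p_short[1][1][:linewidth])
println(p_long[1][1][:markersize] == p_short[1][1][:markersize])
```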

### Principle #3: Columns are Mapped to Series

For both data and attributes, the columns of a matrix are mapped to individual series.  In the example below, a 10 x 2 matrix creates two series.  Compare `p1` and `p2`: if the `st` (`seriestype`) attribute is a vector, the whole vector is applied to every series, so both series get `:scatter` and `:line`.  If `st` is a matrix (here a 1 x 2 row), the attribute in the i-th column is mapped to the i-th series.  This is a succinct way to give each series its own attributes.

```julia
x = randn(10, 2)

# Series 1 --> :scatter & :line
# Series 2 --> :scatter & :line
p1 = plot(x, st=[:scatter, :line])

# Series 1 --> :scatter
# Series 2 --> :line
p2 = plot(x, st=[:scatter :line])

plot(p1, p2)
```
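The only difference between the two calls is the shape of the array passed to `st`.  In plain Julia, commas build a column vector and spaces build a one-row matrix, which this short check makes visible:

```julia
v = [:scatter, :line]   # commas: a 2-element Vector (one column)
m = [:scatter :line]    # spaces: a 1×2 Matrix (two columns)

println(size(v))  # (2,)
println(size(m))  # (1, 2)

# Each column of `m` lines up with one column (series) of a 10×2 data matrix
x = randn(10, 2)
println(size(m, 2) == size(x, 2))  # true
```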

### Principle #4:  Some Attributes are Magic 🪄

Some attributes can be provided with multiple values all at once and Plots will figure out what to do with them.  For example, using `m=(10, .5, "blue")` will set the marker size to `10`, the marker alpha (opacity) to `0.5`, and the marker color to `"blue"`.

``plot(randn(10), m = (10, .5, "blue"))``
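For comparison, the magic argument can be spelled out with individual attributes.  This sketch assumes the long-form names `markersize`, `markeralpha`, and `markercolor`, matching what the text above says `m` sets.  It also sets `markershape` explicitly, on the assumption that markers are not drawn on a `:path` series unless a shape is chosen.

```julia
using Plots

y = randn(10)

# Magic argument: Plots sorts the tuple's values into marker attributes
p_magic = plot(y, m = (10, 0.5, "blue"))

# The same settings written out one attribute at a time
# (markershape is given explicitly so that markers are drawn)
p_explicit = plot(y, markershape = :circle, markersize = 10,
                  markeralpha = 0.5, markercolor = "blue")

# Show both side by side
plot(p_magic, p_explicit, layout = (1, 2))
```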

### Principle #5: Many Types have Plot Recipes

This is best seen through example.  Let's add the RDatasets and OnlineStats packages via Pkg Mode in the REPL:

``(@v1.6) pkg> add OnlineStats RDatasets``

Now load the packages and retrieve the diamonds dataset that comes packaged with R's ggplot2.  The diamonds data is a collection of variables on diamond price and quality.

```julia
using RDatasets, OnlineStats

df = dataset("ggplot2", "diamonds")
```

Suppose the first thing we want to see is the distribution of the `:Cut` variable in our diamonds data.  We'll use `OnlineStats.CountMap` to count the number of occurrences for each unique value in the `:Cut` column.

When we plot the `CountMap`, a recipe is invoked to turn it into data that Plots knows how to display.  What recipes provide, compared with, say, a dedicated `plot_countmap` function, is the ability to hook into plot attributes just as if you were plotting raw numbers.

```julia
o = CountMap(String)

fit!(o, string.(df.Cut))

plot(o, title="Neat!")
```
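To make the idea concrete, here is a minimal sketch of a recipe for a made-up type.  The `Temperatures` struct and its fields are hypothetical, and this is not how the `CountMap` recipe is actually written.  It only illustrates the `@recipe` macro (from RecipesBase, assumed here to be available through `using Plots`) turning a custom type into data that `plot` understands, while ordinary attributes such as `title` keep working.

```julia
using Plots  # assumed to make the @recipe macro available as well

# Hypothetical type used only for this illustration
struct Temperatures
    days::Vector{Int}
    highs::Vector{Float64}
end

# Tell Plots how to turn a Temperatures value into x/y data.
# `-->` sets a default that callers can still override with a keyword.
@recipe function f(t::Temperatures)
    seriestype --> :scatter
    label --> "daily high"
    t.days, t.highs
end

temps = Temperatures(collect(1:5), [21.5, 23.0, 19.8, 24.1, 22.3])

# Regular attributes work just as they do for raw numbers
p = plot(temps, title = "Neat!")
```

Because the recipe only sets defaults, a call such as `plot(temps, seriestype = :path)` should still be able to override them.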

## Try This!

### Use a Different Backend

The backends of Plots can be changed interactively.  Try typing

``plotly()``

to switch to the interactive JavaScript library Plotly.js.  Then rerun the above examples.
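Here is a small sketch of switching backends from code.  It uses `gr()` and `plotly()` to select a backend and `backend_name()` to check which one is active.  Which backends are usable depends on what is installed on your machine.

```julia
using Plots

gr()                      # select the GR backend
println(backend_name())   # name of the active backend

plotly()                  # switch to Plotly.js
p = plot(randn(10), title = "Same code, different backend")

gr()                      # switch back to GR
```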

## That's It!

Now you know Plots' core principles.  Time to try a few things on your own!

Enjoying Julia For Data Science?  Please share us with a friend and follow us on Twitter at @JuliaForDataSci.