Getting (re)-acquainted with R, RStudio, data wrangling, ggplot2, and plotly

Carson Sievert

Slides: https://bit.ly/plotcon17workshop

Slides released under Creative Commons

1 / 50

Your Turn

(1) Do you have the required software (i.e., could you run devtools::install_github('cpsievert/plotcon17') without error)?

(2) Share (with your neighbor) 3 things you're hoping to get from this workshop (share them with me, via Slack if you like!)

PS. remember this background image -- it means I want something from you!

2 / 50

About me

  • PhD in statistics from Iowa State (defended 4 months ago!)
  • Maintainer of plotly's R package (for nearly 2 years!)
    • Before that: animint, LDAvis, pitchRx, rdom

About the workshop

  • Focus on things that are hard to learn from documentation alone.
  • Today is mainly about core ideas (& lots of mapping examples).
  • Tomorrow is a bit more advanced: animation, linked views, & shiny.
  • I'm hoping to save time today/tomorrow for a Q&A session.
    • Feel free to stop me at any time.
    • Feel free to ask me about personal projects during down time.

Did anyone attend my talk on Tuesday?

3 / 50

About the attendees

https://plot.ly/ggplot2/geom_density/

4 / 50

About the attendees, another look

https://plot.ly/r/parallel-coordinates-plot/

5 / 50

R wisdom

Everything that exists is an object.

Everything that happens is a function call.

-- John Chambers

6 / 50

Universal truths

Use the str() function to inspect any object (View() in RStudio is a nice interactive alternative).

str(mtcars)
#> 'data.frame': 32 obs. of 11 variables:
#> $ mpg : num 21 21 22.8 21.4 18.7 18.1 14.3 24.4 22.8 19.2 ...
#> $ cyl : num 6 6 4 6 8 6 8 4 4 6 ...
#> $ disp: num 160 160 108 258 360 ...
#> $ hp : num 110 110 93 110 175 105 245 62 95 123 ...
#> $ drat: num 3.9 3.9 3.85 3.08 3.15 2.76 3.21 3.69 3.92 3.92 ...
#> $ wt : num 2.62 2.88 2.32 3.21 3.44 ...
#> $ qsec: num 16.5 17 18.6 19.4 17 ...
#> $ vs : num 0 0 1 1 0 1 0 1 1 1 ...
#> $ am : num 1 1 1 0 0 0 0 0 0 0 ...
#> $ gear: num 4 4 4 3 3 3 3 4 4 4 ...
#> $ carb: num 4 4 1 1 2 1 4 2 2 4 ...

Use <- to assign value(s) to a name

nms <- names(mtcars)
nms
#> [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
#> [11] "carb"
8 / 50

The pipe operator (%>%)

Takes the object on the left-hand side (LHS) and inserts it as the first argument of the function call on the right-hand side (RHS).

library(magrittr)
mtcars %>% names()
#> [1] "mpg" "cyl" "disp" "hp" "drat" "wt" "qsec" "vs" "am" "gear"
#> [11] "carb"

Makes function composition more readable

# read left-to-right
mtcars %>% names() %>% length()
#> [1] 11
# not inside out
length(names(mtcars))
#> [1] 11
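
To make "inserts into the function on the RHS" a bit more concrete, here is a minimal sketch showing that a piped call and the equivalent nested call give the same result. The vector x is made up for illustration.

```r
library(magrittr)

# x is a made-up vector, used only for illustration
x <- c(4, 9, 16)

# the LHS becomes the first argument of the RHS call
piped  <- x %>% sqrt() %>% sum()
nested <- sum(sqrt(x))
identical(piped, nested)

# extra arguments stay where they are: this is head(mtcars, 3)
first3 <- mtcars %>% head(3)
nrow(first3)
```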
9 / 50

R's basic data structures

Read Hadley Wickham's brilliant chapter on data structures http://adv-r.had.co.nz/Data-structures.html

10 / 50

A data frame holds 1d vectors of equal length, and each one can be homogeneous (an atomic vector) or heterogeneous (a list)!

Read Hadley Wickham's brilliant chapter on data structures http://adv-r.had.co.nz/Data-structures.html
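
A small sketch of that idea in base R, assuming nothing beyond a toy data frame: one column is an atomic vector (homogeneous) and another is a list (heterogeneous). The column names are made up for illustration.

```r
# a data frame is a list of equal-length vectors
df <- data.frame(id = 1:3)

# atomic vectors are homogeneous: every element shares one type
df$name <- c("a", "b", "c")

# lists are heterogeneous: each element can hold anything
df$stuff <- list(1:2, "text", TRUE)

is.list(df)
col_types <- vapply(df, typeof, character(1))
col_types
```

List columns like stuff are what let a single row carry something as complex as a state's polygon.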

11 / 50

More than a table

Watch Jenny Bryan's brilliant talk https://www.youtube.com/watch?v=4MfUCX_KpdE

12 / 50

How is this useful?

13 / 50

What data goes into drawing this map?

14 / 50
library(albersusa)
usa <- usa_sf("laea")
library(dplyr)
usa %>%
  select(name, pop_2010, geometry) %>%
  View()
usa_sf("laea") %>% select(pop_2010, geometry) %>% plot()

15 / 50

Rows should represent the unit of interest!!!

Hadley Wickham (probably)

16 / 50

An aside on dplyr

The R package dplyr makes common SQL-like operations fast and easy [1]

Important single table operations:

  • select()
  • mutate()
  • filter()
  • arrange()
  • distinct()
  • summarise()

I will use dplyr sporadically throughout the workshop. Please stop me if anything needs more explanation.

[1]: It will even perform SQL queries for you -- using the same interface!
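
For anyone who hasn't seen these verbs before, here is a rough sketch of each one applied to mtcars. The object names (top, overall, cyl_gear) and the wt_kg column are made up for illustration; mtcars records wt in units of 1000 lbs.

```r
library(dplyr)

top <- mtcars %>%
  filter(cyl == 4) %>%               # keep rows matching a condition
  mutate(wt_kg = wt * 453.592) %>%   # add a column (wt is in 1000 lbs)
  select(mpg, cyl, wt_kg) %>%        # keep a subset of columns
  arrange(desc(mpg))                 # sort rows, highest mpg first
head(top, 3)

# collapse many rows into a single summary row
overall <- summarise(mtcars, avg_mpg = mean(mpg), n = n())
overall

# unique combinations of cyl and gear
cyl_gear <- distinct(mtcars, cyl, gear)
nrow(cyl_gear)
```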

17 / 50

Your turn

See help(geom_sf, package = "ggplot2"). Can you plot population by state using geom_sf() and usa_sf("laea")?

Bonus: Use plotly::ggplotly() to convert it to an interactive version!

Solution is here

PS. Someone commented "I would really like to learn more about working with shapefiles". Hopefully sf::st_read() just works for you!

18 / 50
library(albersusa)
library(dplyr)
library(plotly)  # also attaches ggplot2

# build the tooltip text for each state
usa <- mutate(
  usa_sf("laea"),
  txt = paste("The state of", name, "had \n", pop_2010, "people in 2010")
)
p <- ggplot(usa) +
  geom_sf(aes(fill = pop_2010, text = txt))
ggplotly(p, tooltip = "text")
19 / 50

More compelling examples

Tooltips & zooming are cool -- but we can do more!

20 / 50

21 / 50

Add 2 lines, & voila!

22 / 50

23 / 50

See demo("highlight-epl", package = "plotly")

24 / 50

Disclaimer

I'd say ~80% of the ggplot2 API is correctly translated by ggplotly().

I'm aiming for ~99% before the end of the year.

Regardless, knowing how it all works helps you work around limitations and specify additional features not supported by the ggplot2 API.

25 / 50

How does it work?

26 / 50

ggplotly() returns a plotly htmlwidget

class(p)
#> [1] "gg" "ggplot"
gg <- ggplotly(p, tooltip = "text")
class(gg)
#> [1] "plotly" "htmlwidget"

The htmlwidgets framework guarantees things just work in any context. [1]

The htmlwidgets gallery has 85 registered widgets to date! http://gallery.htmlwidgets.org/

[1]: For example, at your R prompt, inside RStudio, rmarkdown, or shiny apps
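
One more context worth knowing about is a standalone HTML file. The sketch below assumes a toy bar chart and a temporary output path (out_file is hypothetical); selfcontained = FALSE is used here only so the example does not depend on pandoc being installed.

```r
library(plotly)

# a toy widget, used only for illustration
w <- plot_ly(x = c("a", "b", "c"), y = c(1, 2, 3), type = "bar")

# out_file is a hypothetical location; any writable path works
out_file <- file.path(tempdir(), "barchart.html")

# write the widget (plus its JavaScript dependencies) to disk
htmlwidgets::saveWidget(w, out_file, selfcontained = FALSE)
file.exists(out_file)
```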

27 / 50

Your Turn

Embed the gg map in an rmarkdown document.

Bonus: get the plot to print in an R Notebook

28 / 50

What happens when you print a plotly htmlwidget?

All htmlwidgets take this same (R list -> JSON -> JavaScript -> HTML) approach!

Every htmlwidget is defined through an R list. Any R list maps to JSON through the jsonlite package.

29 / 50

Mapping R list to JSON

barchart <- list(
  data = list(list(
    x = c("a", "b", "c"),
    y = c(1, 2, 3),
    type = "bar"
  ))
)
plotly:::to_JSON(barchart, pretty = TRUE)
#> {
#>   "data": [
#>     {
#>       "x": ["a", "b", "c"],
#>       "y": [1, 2, 3],
#>       "type": "bar"
#>     }
#>   ]
#> }

Pro tip: Did you know ::: can access any object from any package (exported or not)?
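
If you would rather not reach for an internal helper, the same mapping can be sketched with jsonlite directly. This assumes auto_unbox = TRUE is a close enough stand-in for what plotly does; the package may well set other options internally.

```r
library(jsonlite)

barchart <- list(
  data = list(list(
    x = c("a", "b", "c"),
    y = c(1, 2, 3),
    type = "bar"
  ))
)

# auto_unbox = TRUE writes length-one vectors as scalars ("bar", not ["bar"])
json <- toJSON(barchart, auto_unbox = TRUE, pretty = TRUE)
json

# JSON maps back to an R list as well
back <- fromJSON(json, simplifyVector = FALSE)
back$data[[1]]$type
```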

30 / 50

Indexing/subsetting in R

Grab a list element with $ or [[

str(barchart$data)
#> List of 1
#> $ :List of 3
#> ..$ x : chr [1:3] "a" "b" "c"
#> ..$ y : num [1:3] 1 2 3
#> ..$ type: chr "bar"
identical(barchart$data, barchart[["data"]])
#> [1] TRUE

There is also [, which always returns the "container"!

str(barchart["data"])
#> List of 1
#> $ data:List of 1
#> ..$ :List of 3
#> .. ..$ x : chr [1:3] "a" "b" "c"
#> .. ..$ y : num [1:3] 1 2 3
#> .. ..$ type: chr "bar"
31 / 50
str(mtcars["vs"])
#> 'data.frame': 32 obs. of 1 variable:
#> $ vs: num 0 0 1 1 0 1 0 1 1 1 ...
str(mtcars[["vs"]])
#> num [1:32] 0 0 1 1 0 1 0 1 1 1 ...
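
These accessors chain, which is how you reach (and change) values buried inside a nested list. A minimal sketch, rebuilding the barchart list from earlier so it runs on its own:

```r
barchart <- list(
  data = list(list(x = c("a", "b", "c"), y = c(1, 2, 3), type = "bar"))
)

# chain $ and [[ to drill down to a nested element
barchart$data[[1]]$y
barchart[["data"]][[1]][["type"]]

# [ keeps the container, so this is still a list (of one trace)
one_trace <- barchart$data[1]
class(one_trace)

# the same accessors work on the left-hand side of an assignment
barchart$data[[1]]$type <- "scatter"
barchart$data[[1]]$type
```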
32 / 50

Mapping R list to plotly

library(plotly)
as_widget(barchart)
(Output: a bar chart with bars a, b, and c of heights 1, 2, and 3.)
33 / 50

PSA: use plot_ly() over as_widget()

# plot_ly() adds some useful abstractions that we'll get to later
plot_ly() %>%
  add_bars(
    x = c("a", "b", "c"),
    y = c(1, 2, 3),
    unsupported = "nonsense"
  )
#> Warning: 'bar' objects don't have these attributes: 'unsupported'
#> Valid attributes include:
#> 'type', 'visible', 'showlegend', 'legendgroup', 'opacity', 'name', 'uid', 'hoverinfo', 'stream', 'x', 'x0', 'dx', 'y', 'y0', 'dy', 'text', 'hovertext', 'textposition', 'textfont', 'insidetextfont', 'outsidetextfont', 'orientation', 'base', 'offset', 'width', 'marker', 'r', 't', 'error_y', 'error_x', '_deprecated', 'xaxis', 'yaxis', 'xcalendar', 'ycalendar', 'xsrc', 'ysrc', 'textsrc', 'hovertextsrc', 'textpositionsrc', 'basesrc', 'offsetsrc', 'widthsrc', 'rsrc', 'tsrc', 'key', 'set', 'frame', 'transforms', '_isNestedKey', '_isSimpleKey', '_isGraticule'
(Output: the same bar chart, with bars a, b, and c of heights 1, 2, and 3.)
34 / 50

Three ways to widget [1]

ggplotly(): translates ggplot to widget

as_widget(): translates R lists to widget

plot_ly(): translates a custom R-specific grammar to widget

[1]: Actually, four, if you count api_download_file()
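
As a rough side-by-side, here is the same toy bar chart built all three ways. The data frame d and the names w1, w2, w3 are made up for illustration; the point is only that each route ends in an htmlwidget.

```r
library(plotly)  # also attaches ggplot2

# toy data, used only for illustration
d <- data.frame(x = c("a", "b", "c"), y = c(1, 2, 3))

# 1. ggplotly(): start from a ggplot
w1 <- ggplotly(ggplot(d, aes(x, y)) + geom_col())

# 2. as_widget(): start from a plain R list
w2 <- as_widget(list(
  data = list(list(x = d$x, y = d$y, type = "bar"))
))

# 3. plot_ly(): start from plotly's own R grammar
w3 <- plot_ly(d, x = ~x, y = ~y) %>% add_bars()

is_widget <- vapply(list(w1, w2, w3), inherits, logical(1), what = "htmlwidget")
is_widget
```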

35 / 50

Inspect the JSON behind any widget

# In recent versions of RStudio -- gg %>% plotly_build() %>% View()
plotly_json(gg)

The data, layout, and config attributes are official plotly.js attributes covered in the figure reference.

The other attributes are unique to the R package (don't worry about them).
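
If plotly_json() isn't handy, the underlying list can be explored with ordinary R tools. This sketch assumes a toy bar chart rather than the gg map, and uses plotly_build() to evaluate the plot before looking inside.

```r
library(plotly)

# a toy widget, used only for illustration
w <- plot_ly(x = c("a", "b", "c"), y = c(1, 2, 3), type = "bar")

# plotly_build() evaluates the plot so that x$data holds the final traces
b <- plotly_build(w)

# top-level elements; data and layout should be among them
names(b$x)

# the first (and only) trace
b$x$data[[1]]$type
```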

36 / 50

Modify any widget

style() modifies data attributes. layout() modifies the layout.

gg2 <- gg %>%
  style(mode = "markers+lines", traces = 2) %>%
  layout(title = "A map of 2010 population", margin = list(t = 30))
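
The same pattern works on any widget, not just the map. Here is a self-contained sketch on a toy scatterplot (the data and the title are made up), split into two steps so the effect of style() is visible on its own.

```r
library(plotly)

# a toy scatterplot with a single trace
w <- plot_ly(x = 1:3, y = c(2, 1, 3), type = "scatter", mode = "markers")

# style() changes attributes of the chosen trace(s)
styled <- style(w, mode = "markers+lines", traces = 1)
styled$x$data[[1]]$mode

# layout() changes everything that is not tied to a trace
w2 <- layout(styled, title = "A toy scatterplot", margin = list(t = 30))
```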
37 / 50

Note the modification!

# The 'x' element stores the list converted to JSON
# plotly_json() just provides a more pleasant interface to gg$x
str(gg2$x$data[[2]])
#> List of 14
#> $ x : num [1:238] -324546 -325004 -325571 -325589 -326462 ...
#> $ y : num [1:238] -110164 -119390 -130838 -131207 -149343 ...
#> $ text : chr "The state of Wyoming had <br /> 564358 people in 2010"
#> $ type : chr "scatter"
#> $ mode : chr "markers+lines"
#> $ line :List of 3
#> ..$ width: num 1.89
#> ..$ color: chr "rgba(89,89,89,1)"
#> ..$ dash : chr "solid"
#> $ fill : chr "toself"
#> $ fillcolor : chr "rgba(19,43,67,1)"
#> $ hoveron : chr "fills"
#> $ showlegend: logi FALSE
#> $ xaxis : chr "x"
#> $ yaxis : chr "y"
#> $ hoverinfo : chr "text"
#> $ frame : chr NA
38 / 50

Can also add data to any widget

There are a number of add_*() functions (e.g., add_bars(), add_polygons(), add_trace()).

d <- gg$x$data[[52]]
add_polygons(gg, x = d$x, y = d$y, color = I("red"), inherit = FALSE)
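
The example above depends on the gg map. As a standalone sketch, here is add_polygons() drawing a unit square on an empty plot (the coordinates are made up), followed by a look at the trace it produces.

```r
library(plotly)

# corners of a unit square, made up for illustration
sq_x <- c(0, 1, 1, 0)
sq_y <- c(0, 0, 1, 1)

w <- plot_ly() %>%
  add_polygons(x = sq_x, y = sq_y, color = I("red"))

# a polygon is a scatter trace that fills its own outline
tr <- plotly_build(w)$x$data[[1]]
tr$type
tr$fill
```

This matches the type and fill values shown for the Wyoming trace earlier.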
39 / 50

Resources for studying the figure reference

https://plot.ly/r/reference/

https://github.com/rreusser/plotly-doc-viewer

# In recent versions of RStudio -- View(plotly:::Schema)
schema()
40 / 50

Your Turn

Overlay text on top of Wyoming using either a scatter trace with text mode or an annotation

Tip: Use sf::st_centroid() to find the center point of polygon(s).

Bonus: Can you label all the states?

Solution is here
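
As a hint for the tip about sf::st_centroid() (not the full solution), here is a minimal sketch on a made-up square rather than a real state polygon.

```r
library(sf)

# a 2 x 2 square, made up for illustration (the ring must close on itself)
sq <- st_polygon(list(rbind(
  c(0, 0), c(2, 0), c(2, 2), c(0, 2), c(0, 0)
)))

# the center point of the polygon
ctr <- st_centroid(sq)

# pull out plain x/y coordinates
xy <- st_coordinates(ctr)
xy
```

Coordinates like these could then be passed as x and y to a text trace or an annotation.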

41 / 50

Raster objects

  • Raster objects are basically a matrix of color codes. These objects can be used to represent bitmap images.
m <- matrix(hcl(0, 80, seq(50, 80, 10)), nrow = 4, ncol = 5)
(r <- as.raster(m))
#> [,1] [,2] [,3] [,4] [,5]
#> [1,] "#C54E6D" "#C54E6D" "#C54E6D" "#C54E6D" "#C54E6D"
#> [2,] "#E16A86" "#E16A86" "#E16A86" "#E16A86" "#E16A86"
#> [3,] "#FE86A1" "#FE86A1" "#FE86A1" "#FE86A1" "#FE86A1"
#> [4,] "#FFA2BC" "#FFA2BC" "#FFA2BC" "#FFA2BC" "#FFA2BC"
plot(r)

42 / 50

Embedding raster objects in plotly

plot_ly() %>%
  layout(images = list(
    source = raster2uri(r), # converts a raster object to a data URI.
    xref = "x", yref = "y", x = 0, y = 0, sizex = 1, sizey = 1,
    sizing = "stretch", xanchor = "left", yanchor = "bottom"
  ))
43 / 50

ggmap objects are bitmap images!

library(ggmap)
basemap <- get_map(maptype = "satellite", zoom = 8)
p <- ggmap(basemap) +
  geom_polygon(aes(x = lon, y = lat, group = plotOrder),
               data = zips, colour = "black", fill = NA) +
  ggthemes::theme_map()
44 / 50

Add zoom/pan/tooltips via ggplotly()

ggplotly(p)
45 / 50

A crossroads in ggplot2 mapping

  • Going forward, geom_sf() will be the preferred way to map in ggplot2, but it is still under development.
  • It will support any map projection, thanks to the magic of sf.
library(maps)
library(sf)
world1 <- st_as_sf(map('world', plot = FALSE, fill = TRUE))
ggplot() + geom_sf(data = world1)
46 / 50

Finding map projections

47 / 50

Mollweide projection of Canada

# http://spatialreference.org/ref/sr-org/7/proj4/
canada <- subset(world1, ID == "Canada")
canada2 <- st_transform(canada,"+proj=moll +lon_0=0 +x_0=0 +y_0=0 +ellps=WGS84 +units=m +no_defs")
ggplot() + geom_sf(data = canada2)
48 / 50

Now with ggplotly()

ggplotly()
49 / 50

Your turn

(1) Peruse some of the examples on https://plot.ly/r/maps/. Which approach do you like best (ggplotly(), plot_geo(), or plot_mapbox())? Let me know via Slack. Can you point out some advantages/disadvantages to each approach?

(2) See the last example on https://plot.ly/r/lines-on-maps/ -- how does the plot know to render in 3D? Can you make a 2D version?

Not interested in maps? Peek through tomorrow's slides. Tell me if you want to see something else!

Have a personal project (related to plotly) that you need help with? Ask me!

50 / 50
