
Rank histogram #179


Merged: jgabry merged 33 commits into master from rank-histogram on May 21, 2019


tjmahr
Collaborator

@tjmahr tjmahr commented Mar 20, 2019

This is a work-in-progress pull request to address #178.

So far I've added mcmc_trace_data() which prepares a dataframe of the values in each chain. While I was at it, I refactored mcmc_trace() to take advantage of this function.

The result of mcmc_trace_data() includes a column value_rank. With this addition, we can very quickly make a histogram of ranks.

Here is a prototype of the main steps.

library(bayesplot)
#> This is bayesplot version 1.6.0.9000
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting
library(tidyverse)
#> Warning: package 'tibble' was built under R version 3.5.3
#> Warning: package 'purrr' was built under R version 3.5.3

x <- example_mcmc_draws(params = 4)

d <- bayesplot:::mcmc_trace_data(x)
n_iter <- unique(d$n_iterations)
n_chains <- unique(d$n_chains)

d_boundaries <- d %>% 
  dplyr::distinct(chain, parameter) 

d_boundaries <- dplyr::bind_rows(
    mutate(d_boundaries, value_rank = min(d$value_rank)),
    mutate(d_boundaries, value_rank = max(d$value_rank))
  )

# Print parameter and chain on one line
labeller <- function(x) label_value(x, multi_line = FALSE)

ggplot(d) + 
  aes(x = value_rank) + 
  geom_histogram(boundary = 0, bins = 20, color = "white") + 
  facet_wrap(
    c("parameter", "chain"), dir = "v",
    nrow = n_chains, 
    # Need free_x to have the x axis under each facet
    scales = "free_x", 
    labeller = labeller) +
  # Draw blank data to fix the range in each facet
  geom_blank(data = d_boundaries)

Created on 2019-03-20 by the reprex package (v0.2.1)
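
For readers unfamiliar with the value_rank idea, here is a minimal base-R sketch of what a rank histogram counts. It assumes that ranks are taken over the pooled draws of all chains for one parameter and then binned separately per chain. The simulated draws and the names draws, pooled_rank, and counts are illustrative and are not taken from bayesplot's internals.

```r
# Simulated draws for ONE parameter: rows are iterations, columns are chains.
# (Hypothetical data; a real fit would supply these values.)
set.seed(1)
n_iter <- 250
n_chains <- 4
draws <- matrix(rnorm(n_iter * n_chains), nrow = n_iter, ncol = n_chains)

# Assumption: rank each draw against the pooled draws from all chains.
pooled_rank <- matrix(
  rank(draws, ties.method = "average"),
  nrow = n_iter, ncol = n_chains
)

# Bin the ranks into equal-width bins, separately for each chain.
n_bins <- 20
breaks <- seq(0, n_iter * n_chains, length.out = n_bins + 1)
counts <- apply(pooled_rank, 2, function(r) table(cut(r, breaks = breaks)))

# One column of bin counts per chain; each column sums to n_iter.
colSums(counts)
```

If the chains are sampling the same distribution, each column of counts should look roughly flat, which is what the faceted histograms above are meant to show.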

@jgabry
Member

jgabry commented Mar 25, 2019

Thanks for working on this TJ!

@tjmahr
Collaborator Author

tjmahr commented Apr 5, 2019

I've added mcmc_rank_overlay() which is inspired by Richard McElreath's version of the histograms.

I also added "viridisE" which is the "cividis" variant of viridis to the color set.

library(bayesplot)
#> This is bayesplot version 1.6.0.9000
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting
x <- example_mcmc_draws(params = 4)
bayesplot::color_scheme_set("viridisE")
mcmc_rank_overlay(x, "alpha")

Created on 2019-04-05 by the reprex package (v0.2.1)

@jgabry
Member

jgabry commented Apr 7, 2019

Nice! Let me know if/when this is ready for review.

@tjmahr tjmahr mentioned this pull request Apr 8, 2019
@tjmahr
Collaborator Author

tjmahr commented Apr 10, 2019

I should add a horizontal line at uniform height... https://arviz-devs.github.io/arviz/generated/arviz.plot_rank.html
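
As a rough illustration of the "uniform height" mentioned here: if ranks are pooled across chains and split into equal-width bins, each chain is expected to contribute about the same count to every bin. The helper uniform_height() below is hypothetical, and the numbers are made up for the example. Whether the eventual reference line uses exactly this value is an assumption.

```r
# Expected per-bin count for ONE chain when its pooled ranks are uniform.
# Assumption: every chain has n_iter draws and the bins have equal width.
uniform_height <- function(n_iter, n_bins) {
  n_iter / n_bins
}

h <- uniform_height(n_iter = 250, n_bins = 20)
h
#> [1] 12.5
```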

@avehtari
Member

avehtari commented May 9, 2019

Do I read the pull request correctly that it doesn't yet have a function which would plot the rank histograms in different subplots? (I don't like the overlaid histograms as they get easily messy).

@tjmahr
Collaborator Author

tjmahr commented May 10, 2019

> Do I read the pull request correctly that it doesn't yet have a function which would plot the rank histograms in different subplots? (I don't like the overlaid histograms as they get easily messy). -- @ave

I've added mcmc_rank_hist().

library(bayesplot)
#> Registered S3 methods overwritten by 'ggplot2':
#>   method         from 
#>   [.quosures     rlang
#>   c.quosures     rlang
#>   print.quosures rlang
#> This is bayesplot version 1.6.0.9000
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting
x <- example_mcmc_draws()
color_scheme_set("viridisE")
mcmc_rank_hist(x, c("beta[1]"))

mcmc_rank_hist(x, c("alpha", "beta[1]"))

Created on 2019-05-10 by the reprex package (v0.2.1)

@codecov-io

codecov-io commented May 10, 2019

Codecov Report

❗ No coverage uploaded for pull request base (master@fb533f2).
The diff coverage is 100%.


@@            Coverage Diff            @@
##             master     #179   +/-   ##
=========================================
  Coverage          ?   99.33%           
=========================================
  Files             ?       30           
  Lines             ?     4209           
  Branches          ?        0           
=========================================
  Hits              ?     4181           
  Misses            ?       28           
  Partials          ?        0
| Impacted Files | Coverage Δ |
| --- | --- |
| R/bayesplot-colors.R | 99.24% <ø> (ø) |
| R/ppc-distributions.R | 100% <100%> (ø) |
| R/helpers-ppc.R | 96.77% <100%> (ø) |
| R/mcmc-diagnostics.R | 97.65% <100%> (ø) |
| R/helpers-gg.R | 93.75% <100%> (ø) |
| R/mcmc-traces.R | 100% <100%> (ø) |

Legend: Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Last update fb533f2...691fd80.

@avehtari
Member

> I've added mcmc_rank_hist().

Great! I would remove y axis

@avehtari avehtari mentioned this pull request May 19, 2019
Merge branch 'master' into rank-histogram

# Conflicts:
#	R/mcmc-traces.R
#	man/MCMC-traces.Rd
@tjmahr
Collaborator Author

tjmahr commented May 21, 2019

> > I've added mcmc_rank_hist().
>
> Great! I would remove y axis

I have removed the y-axis and created an option to show a reference line.

library(bayesplot)
#> This is bayesplot version 1.6.0.9000
#> - Online documentation and vignettes at mc-stan.org/bayesplot
#> - bayesplot theme set to bayesplot::theme_default()
#>    * Does _not_ affect other ggplot2 plots
#>    * See ?bayesplot_theme_set for details on theme setting
x <- example_mcmc_draws()
color_scheme_set("viridisE")
mcmc_rank_hist(x, c("beta[1]"))

mcmc_rank_hist(x, c("beta[1]"), ref_line = TRUE)

mcmc_rank_hist(x, c("alpha", "beta[1]"), n_bins = 10)

mcmc_rank_hist(x, c("alpha", "beta[1]"), ref_line = TRUE)

Created on 2019-05-21 by the reprex package (v0.3.0)

@tjmahr tjmahr changed the title from "[WIP] Rank histogram" to "Rank histogram" May 21, 2019
@tjmahr tjmahr requested a review from jgabry May 21, 2019 17:34
@tjmahr
Collaborator Author

tjmahr commented May 21, 2019

Ready for review.

@jgabry
Member

jgabry commented May 21, 2019

@tjmahr this looks great. Quick question: the size, np, and np_style arguments do not appear to be used inside mcmc_trace_data(). Can I remove these arguments?

@tjmahr
Collaborator Author

tjmahr commented May 21, 2019

> @tjmahr this looks great. Quick question: the size, np, and np_style arguments do not appear to be used inside mcmc_trace_data(). Can I remove these arguments?

Yes. Good catch. Then be sure to update the call to mcmc_trace_data() in .mcmc_trace().

@avehtari
Member

> I have removed the y-axis and created an option to show a reference line.

Looks good!

@jgabry
Member

jgabry commented May 21, 2019

Actually same goes for the window argument, but for that one there’s a todo note about maybe having mcmc_trace_data() return only the specified window:

## @todo: filter to just window?

I’m ok with either option (keeping it as is or having mcmc_trace_data() handle the window), so your call. Did you want to decide on this before I merge this or hold off and maybe do it in the future?

@tjmahr
Collaborator Author

tjmahr commented May 21, 2019

I think we can drop window and the note there too. I wasn't sure at the time when I wrote the note.

But we never do filter() the data to the window (we just set the coordinate limits to the window) so we don't need to make it part of the data preparation.
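
The distinction drawn here, zooming with coordinate limits versus filtering the data, can be sketched in plain ggplot2. The data frame d and the window values below are made up for illustration, and using coord_cartesian() for the zoom is an assumption about one reasonable way to set coordinate limits, not a statement of how the trace plot code does it.

```r
library(ggplot2)

# Hypothetical trace: 100 iterations of a single chain.
d <- data.frame(iteration = 1:100, value = cumsum(rep(c(1, -1), 50)))
window <- c(20, 40)

# Option 1: keep all rows and only zoom the x axis.
# The data behind the plot is unchanged.
p_zoom <- ggplot(d, aes(iteration, value)) +
  geom_line() +
  coord_cartesian(xlim = window)

# Option 2: drop rows outside the window before plotting.
d_sub <- d[d$iteration >= window[1] & d$iteration <= window[2], ]
p_filter <- ggplot(d_sub, aes(iteration, value)) +
  geom_line()

nrow(d)     # all rows are still available under option 1
nrow(d_sub) # only the rows inside the window remain under option 2
```

With option 1 the data preparation step does not need to know about the window at all, which matches the reasoning for dropping the argument.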

@jgabry
Member

jgabry commented May 21, 2019

Ok sounds good. I'll take care of it.

@jgabry jgabry merged commit 5eced5b into master May 21, 2019
@jgabry jgabry deleted the rank-histogram branch May 21, 2019 21:17