Skip to content

allow passing coordinate names as x and y to plot methods #608

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 7 commits into from
Oct 9, 2015
Merged

allow passing coordinate names as x and y to plot methods #608

merged 7 commits into from
Oct 9, 2015

Conversation

jhamman
Copy link
Member

@jhamman jhamman commented Oct 4, 2015

This allows us to use coordinate names as plotting arguments for x and y.

I have a test for this that isn't quite working yet.

@@ -45,6 +45,11 @@ def _infer_xy_labels(plotfunc, darray, x, y):

darray is a 2 dimensional data array.
"""

if all(v in darray.coords for v in {x, y}):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Set comprehensions don't work in Python 2.6 (which we still support, alas).

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

Maybe instead of the complex fallback logic to handle specifying x and/or y, we should rewrite this to only handle two cases:

  1. Both x and y are provided explicitly. They can be any coordinate variables.
  2. Neither x nor y is provided, so use dims to provide default values.

Something like this...

def _infer_xy_labels(plotfunc, darray, x, y):
    if x is None and y is None:
        if darray.ndim != 2:
            raise ValueError('must be 2d')
        x, y = darray.dims
    elif x is None or y is None:
        raise ValueError('cannot supply only one of x and y')
    elif any(k not in darray.coords for k in (x, y)):
        raise ValueError('x and y must be coordinate variables')
    return x, y

@jhamman
Copy link
Member Author

jhamman commented Oct 7, 2015

...only handle two cases

This is exactly where I was headed.

elif x is None or y is None:
raise ValueError('cannot supply only one of x and y')
elif any(k not in darray.coords for k in (x, y)):
raise ValueError('x and y must be coordinate variables %s' % darray.coords)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just print the coord names.

@jhamman
Copy link
Member Author

jhamman commented Oct 7, 2015

I think this is ready for a review. This fixes #611 and allows us to pass 2d coordinate names into plotting methods.

# Fixed FacetGrid labels
fg = da.plot.pcolormesh(col='time', col_wrap=4)

fixed_labels

# 2d coord names
ax = plt.axes(projection=Rasm())
ds.frac.plot.pcolormesh(x='xc', y='yc', transform=cartopy.crs.PlateCarree())

fixed2d

@@ -511,6 +469,11 @@ def imshow(x, y, z, ax, **kwargs):
The pixels are centered on the coordinates values. Ie, if the coordinate
value is 3.2 then the pixels for those coordinates will be centered on 3.2.
"""

if x.ndim != 1 or y.ndim != 1:
raise ValueError('Imshow requires 1D coordinates, try using '
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: I wouldn't capitalize imshow because it's a python identifier

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

Can we make it so the generic .plot(x='lon', y='lat') works? Right now it defaults to imshow is the dimensions are equally spaced but that logic isn't probably right anymore if we allow different x/y labels:
https://github.com/xray/xray/blob/cb4e4138fdb52078d27341dadf0feedb15e1de80/xray/plot/plot.py#L161

@jhamman
Copy link
Member Author

jhamman commented Oct 7, 2015

Can we make it so the generic .plot(x='x2d', y='x2d') works?

How about making the default pcolormesh since that works in all cases?

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

How about making the default pcolormesh since that works in all cases?

👍

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

While you're at it -- maybe the default .plot method shouldn't start with darray.squeeze()? That sort of behavior starts to make things behavior to predict. In particular, faceting should still work consistently even if one of the dimensions only has length 1.

@jhamman
Copy link
Member Author

jhamman commented Oct 7, 2015

so you would have a DataArray of shape (1, 5) be plotted via pcolormesh (by default)?

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

so you would have a DataArray of shape (1, 5) be plotted via pcolormesh (by default)?

Either that, or we should extract the row and col arguments first, to ensure they aren't squeezed out.

On the other hand, it's not so terrible to encourage people use to explicit plot method in applications instead of the generic .plot, so maybe this isn't entirely necessary.

@jhamman
Copy link
Member Author

jhamman commented Oct 7, 2015

I started to go down the road of removing the squeeze you were talking about but _infer_interval_breaks got in my way (when len(coord) == 1). I think this is an important corner case but this PR is getting pretty cluttered so maybe it is best to save that for another.

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

OK, works for me :)

On Wed, Oct 7, 2015 at 2:40 PM, Joe Hamman [email protected] wrote:

I started to go down the road of removing the squeeze you were talking
about but _infer_interval_breaks got in my way (when len(coord) == 1). I
think this is an important corner case but this PR is getting pretty
cluttered so maybe it is best to save that for another.


Reply to this email directly or view it on GitHub
#608 (comment).

@jhamman
Copy link
Member Author

jhamman commented Oct 7, 2015

So, are we okay with the default 2D plot being pcolormesh? It certainly simplifies things since we don't have to determine if the coords are uniform and it works for 1D and 2D coords.

I think @clarkfitzg may have an objection.

@shoyer
Copy link
Member

shoyer commented Oct 7, 2015

Originally, we only had imshow and contourf. We picked imshow as the default because it shows individual pixels. But pcolormesh also meets that criteria, and has the convenience of also working on non-evenly spaced grid.

The only reason to possibly not pick pcolormesh is that it has a different convention from imshow in terms of the image orientation (top-left vs bottom-left origin). Given that xray is largely used for data with meaningful coordinate labels (rather than images), I think it's probably a better choice to use the bottom-left origin as a default. It composing better, for example, with 1D plots.

@@ -657,7 +707,7 @@ def test_imshow_called(self):
self.assertTrue(self.imshow_called(self.darray.plot.imshow))

def test_xy_pixel_centered(self):
self.darray.plot.imshow()
self.darray.plot.imshow(yincrease=None)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this become necessary?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

good catch. At one point this test was failing for me so I was playing around, trying to understand why. Forgot to roll back that change. Fixed in d793e3a.

@clarkfitzg
Copy link
Member

pcolormesh is more general, and works fine.

The problem, however, is time! How long does it take to plot and render with the two?

imshow can be orders of magnitude faster to plot and render.

@clarkfitzg
Copy link
Member

With a 1000 x 1000 grid I'm seeing 10 ms for imshow compared to 200 ms for pcolormesh. And much longer to render.

@jhamman
Copy link
Member Author

jhamman commented Oct 8, 2015

This is a good point. For me, its not a deal breaker though and the benefits of using pcolormesh all the time (simpler code, broad applicability) are worth the wait.

@shoyer
Copy link
Member

shoyer commented Oct 8, 2015

I think pcolormesh is slightly more likely to yield correct and consistent results for plotting different types of data than our current hybrid of imshow/contourf. In my experience, pcolormesh also works better with cartopy. For default behavior, I would lean towards being correct rather than fast. Of course, waiting ~0.5 seconds per plot does start to add up when you're faceting, but it's not much more trouble to type .imshow.

On Wed, Oct 7, 2015 at 7:04 PM, Joe Hamman [email protected]
wrote:

This is a good point. For me, its not a deal breaker though and the benefits of using pcolormesh all the time (simpler code, broad applicability) are worth the wait.

Reply to this email directly or view it on GitHub:
#608 (comment)

@clarkfitzg
Copy link
Member

Simplification of the logic is a good thing, so I'm fine with this.

We should call out the performance difference in the docs, maybe both in the main 2d plot section and the faceting section.

@jhamman
Copy link
Member Author

jhamman commented Oct 8, 2015

I will add a bit to the docs.

@jhamman
Copy link
Member Author

jhamman commented Oct 8, 2015

I have updated the docs with a note about speed.

I also updated the default behavior of xincrease and yincrease from None to True. By going the pcolormesh route, it basically forces us to do that (which isn't a bad thing).

@shoyer
Copy link
Member

shoyer commented Oct 8, 2015

I'm conflicted about what the default xincrease/yincrease keyword values should be. In practice, it seems like some datasets need them, and some don't. But it also depends on the specific application.

@jhamman could you please explain your reasoning a little more for why we need this switch? Do you have a comprehensive sense of the ways that people organize their data cubes?

With climate data, it does seem pretty common that datasets have decreasing latitude values (e.g., 90, 80, 70, ...). This works well with the default behavior of imshow, but not pcolormesh (unless yincrease=True). I suppose this is your motivation?

@jhamman
Copy link
Member Author

jhamman commented Oct 8, 2015

Yes, that was my motivation. Basically, switching to pcolormesh requires us to do this - the alternative is to add yincrease=True to all the 2d invocations of plot and pcolormesh in the docs.

I wouldn't say I have a comprehensive sense of how everyone orders their tick labels. In the earth sciences, which is my area of study, I'd say 95% of plots use the standard increasing y notation. The most common use cases for a decreasing y coordinate would probably be depth below the surface (where depth is increasingly positive).

One nice thing about setting yincrease=True by default, is that all our 2d plots come out with the same orientation. This is different than what you would get with matplotlib so I understand the hesitation.

@clarkfitzg
Copy link
Member

I think it's fine to have yincrease=True and xincrease=True everywhere as the defaults. This is how most people expect graphs to look. This is a place where internal consistency within xray probably beats conforming to the matplotlib defaults.

@shoyer
Copy link
Member

shoyer commented Oct 9, 2015

I'm OK with changing the default to yincrease=True. We might consider making this a globally settable option, if, for example, we start to get lots of people familiar with the other convention (popular for images) using xray.

@jhamman
Copy link
Member Author

jhamman commented Oct 9, 2015

I'm fine with adding a global option, but yes, we should wait until someone asks for that feature.

@shoyer
Copy link
Member

shoyer commented Oct 9, 2015

LGTM

jhamman pushed a commit that referenced this pull request Oct 9, 2015
allow passing coordinate names as x and y to plot methods
@jhamman jhamman merged commit f270b9f into pydata:master Oct 9, 2015
@jhamman jhamman deleted the feature/select_coords_for_plotting branch October 9, 2015 14:21
@jhamman
Copy link
Member Author

jhamman commented Oct 9, 2015

Thanks @shoyer and @clarkfitzg for the input on this. Funny how a small change can sometime unravel into a much larger issue.

@clarkfitzg
Copy link
Member

thanks @jhamman for making it work!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants