Commit 84b326a

List "Unindexed dimensions" separately in __repr__ (#1221)
Fixes GH1199

Previously (on master):

    <xarray.Dataset>
    Dimensions: (x: 1, y: 2)
    Coordinates:
      o x (x) -
      o y (y) -
    Data variables:
        foo (x, y) int64 1 2

Now:

    <xarray.Dataset>
    Dimensions: (x: 1, y: 2)
    Coordinates:
        *empty*
    Unindexed dimensions:
        x, y
    Data variables:
        foo (x, y) int64 1 2

This version of the `__repr__` should be much more self-explanatory.
1 parent 80fbc6e commit 84b326a

6 files changed, +92 -46 lines changed

doc/data-structures.rst

Lines changed: 12 additions & 8 deletions
@@ -48,7 +48,7 @@ The :py:class:`~xarray.DataArray` constructor takes:
   :py:class:`~pandas.Series`, :py:class:`~pandas.DataFrame` or :py:class:`~pandas.Panel`)
 - ``coords``: a list or dictionary of coordinates
 - ``dims``: a list of dimension names. If omitted, dimension names are
-  taken from ``coords`` if possible
+  taken from ``coords`` if possible.
 - ``attrs``: a dictionary of attributes to add to the instance
 - ``name``: a string that names the instance

@@ -69,15 +69,19 @@ in with default values:
 
 As you can see, dimension names are always present in the xarray data model: if
 you do not provide them, defaults of the form ``dim_N`` will be created.
+However, coordinates are optional. If you do not specific coordinates for a
+dimension, the axis name will appear under the list of "Unindexed dimensions".
 
 .. note::
 
-  Prior to xarray v0.9, coordinates corresponding to dimension were *also*
-  always present in xarray: xarray would create default coordinates of the form
-  ``range(dim_size)`` if coordinates were not supplied explicitly. This is no
-  longer the case.
+  This is different from pandas, where axes always have tick labels, which
+  default to the integers ``[0, ..., n-1]``.
 
-Coordinates can take the following forms:
+  Prior to xarray v0.9, xarray copied this behavior: default coordinates for
+  each dimension would be created if coordinates were not supplied explicitly.
+  This is no longer the case.
+
+Coordinates can be specified in the following ways:
 
 - A list of values with length equal to the number of dimensions, providing
   coordinate labels for each dimension. Each value must be of one of the
@@ -243,8 +247,8 @@ Creating a Dataset
 To make an :py:class:`~xarray.Dataset` from scratch, supply dictionaries for any
 variables (``data_vars``), coordinates (``coords``) and attributes (``attrs``).
 
-- ``data_vars`` should be a dictionary with each key as the name of the variable and each
-  value as one of:
+- ``data_vars`` should be a dictionary with each key as the name of the variable
+  and each value as one of:
 
   * A :py:class:`~xarray.DataArray` or :py:class:`~xarray.Variable`
   * A tuple of the form ``(dims, data[, attrs])``, which is converted into

doc/whats-new.rst

Lines changed: 19 additions & 1 deletion
@@ -31,7 +31,25 @@ Breaking changes
 ~~~~~~~~~~~~~~~~
 
 - Index coordinates for each dimensions are now optional, and no longer created
-  by default :issue:`1017`. This has a number of implications:
+  by default :issue:`1017`. You can identify such dimensions without indexes by
+  their appearance in list of "Unindexed dimensions" in the ``Dataset`` or
+  ``DataArray`` repr:
+
+  .. ipython::
+    :verbatim:
+
+    In [1]: xr.Dataset({'foo': (('x', 'y'), [[1, 2]])})
+    Out[1]:
+    <xarray.Dataset>
+    Dimensions: (x: 1, y: 2)
+    Coordinates:
+        *empty*
+    Unindexed dimensions:
+        x, y
+    Data variables:
+        foo (x, y) int64 1 2
+
+  This has a number of implications:
 
   - :py:func:`~align` and :py:meth:`~Dataset.reindex` can now error, if
     dimensions labels are missing and dimensions have different sizes.

xarray/core/dataset.py

Lines changed: 1 addition & 1 deletion
@@ -276,7 +276,7 @@ def __getitem__(self, key):
             raise KeyError(key)
 
     def __unicode__(self):
-        return formatting.vars_repr(self)
+        return formatting.data_vars_repr(self)
 
     @property
     def variables(self):

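The renamed `data_vars_repr` is built in `xarray/core/formatting.py` by specializing `_mapping_repr` with `functools.partial`. The following is a minimal, dependency-free sketch of that pattern, not xarray's actual implementation: `mapping_repr` and `summarize_item` are simplified stand-ins invented for illustration, and the `*empty*` placeholder is assumed from the repr output quoted in the commit message.

```python
import functools


def mapping_repr(mapping, title, summarizer):
    # Simplified stand-in for _mapping_repr: a title line followed by
    # one summary line per item, or a placeholder when the mapping is empty.
    summary = ['%s:' % title]
    if mapping:
        summary += [summarizer(k, v) for k, v in mapping.items()]
    else:
        summary += ['    *empty*']
    return '\n'.join(summary)


def summarize_item(name, value):
    # Hypothetical summarizer; xarray's summarize_var prints dims, dtype and values.
    return '    %s: %r' % (name, value)


# Each public repr helper is the generic function with title and summarizer fixed.
data_vars_repr = functools.partial(
    mapping_repr, title='Data variables', summarizer=summarize_item)
attrs_repr = functools.partial(
    mapping_repr, title='Attributes', summarizer=summarize_item)

print(data_vars_repr({'foo': 1.0}))
print(attrs_repr({}))
```

Because every section shares one generic function, renaming `vars_repr` to `data_vars_repr` only touches the binding and its call sites.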
xarray/core/formatting.py

Lines changed: 23 additions & 18 deletions
@@ -211,13 +211,6 @@ def _summarize_var_or_coord(name, var, col_width, show_values=True,
     return front_str + values_str
 
 
-def _summarize_dummy_var(name, col_width, marker=u'o', values=u'-'):
-    """Used if there is no coordinate for a dimension."""
-    first_col = pretty_print(u' %s %s ' % (marker, name), col_width)
-    dims_str = u'(%s) ' % unicode_type(name)
-    return u'%s%s%s' % (first_col, dims_str, values)
-
-
 def _summarize_coord_multiindex(coord, col_width, marker):
     first_col = pretty_print(u' %s %s ' % (marker, coord.name), col_width)
     return u'%s(%s) MultiIndex' % (first_col, unicode_type(coord.dims[0]))
@@ -248,8 +241,6 @@ def summarize_var(name, var, col_width):
 
 
 def summarize_coord(name, var, col_width):
-    if var is None:
-        return _summarize_dummy_var(name, col_width)
     is_index = name in var.dims
     show_values = is_index or _not_remote(var)
     marker = u'*' if is_index else u' '
@@ -305,8 +296,8 @@ def _mapping_repr(mapping, title, summarizer, col_width=None):
     return u'\n'.join(summary)
 
 
-vars_repr = functools.partial(_mapping_repr, title=u'Data variables',
-                              summarizer=summarize_var)
+data_vars_repr = functools.partial(_mapping_repr, title=u'Data variables',
+                                   summarizer=summarize_var)
 
 
 attrs_repr = functools.partial(_mapping_repr, title=u'Attributes',
@@ -316,12 +307,7 @@ def _mapping_repr(mapping, title, summarizer, col_width=None):
 def coords_repr(coords, col_width=None):
     if col_width is None:
         col_width = _calculate_col_width(_get_col_items(coords))
-    # augment coordinates to include markers for missing coordinates
-    augmented_coords = OrderedDict(coords)
-    for dim in coords.dims:
-        if dim not in augmented_coords:
-            augmented_coords[dim] = None
-    return _mapping_repr(augmented_coords, title=u'Coordinates',
+    return _mapping_repr(coords, title=u'Coordinates',
                          summarizer=summarize_coord, col_width=col_width)
 
 

@@ -337,6 +323,15 @@ def dim_summary(obj):
     return u', '.join(elements)
 
 
+def unindexed_dims_repr(dims, coords):
+    unindexed_dims = [d for d in dims if d not in coords]
+    if unindexed_dims:
+        dims_str = u', '.join(u'%s' % d for d in unindexed_dims)
+        return u'Unindexed dimensions:\n' + u' ' * 4 + dims_str
+    else:
+        return None
+
+
 @contextlib.contextmanager
 def set_numpy_options(*args, **kwargs):
     original = np.get_printoptions()
@@ -386,6 +381,10 @@ def array_repr(arr):
         if arr.coords:
             summary.append(repr(arr.coords))
 
+        unindexed_dims_str = unindexed_dims_repr(arr.dims, arr.coords)
+        if unindexed_dims_str:
+            summary.append(unindexed_dims_str)
+
     if arr.attrs:
         summary.append(attrs_repr(arr.attrs))

@@ -401,7 +400,13 @@ def dataset_repr(ds):
     summary.append(u'%s(%s)' % (dims_start, dim_summary(ds)))
 
     summary.append(coords_repr(ds.coords, col_width=col_width))
-    summary.append(vars_repr(ds.data_vars, col_width=col_width))
+
+    unindexed_dims_str = unindexed_dims_repr(ds.dims, ds.coords)
+    if unindexed_dims_str:
+        summary.append(unindexed_dims_str)
+
+    summary.append(data_vars_repr(ds.data_vars, col_width=col_width))
+
     if ds.attrs:
         summary.append(attrs_repr(ds.attrs))

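To see how the new section behaves outside xarray, here is a small standalone sketch. `unindexed_dims_repr` mirrors the helper added in this diff, with the Python 2 `u''` prefixes dropped. `build_summary` is a hypothetical stand-in for `dataset_repr` that only shows where the optional section is appended, and plain dicts are assumed in place of xarray's dims and coords mappings.

```python
def unindexed_dims_repr(dims, coords):
    """Return the 'Unindexed dimensions' section, or None if every
    dimension has a coordinate of the same name."""
    unindexed_dims = [d for d in dims if d not in coords]
    if unindexed_dims:
        dims_str = ', '.join('%s' % d for d in unindexed_dims)
        return 'Unindexed dimensions:\n' + ' ' * 4 + dims_str
    return None


def build_summary(dims, coords):
    # Hypothetical stand-in for dataset_repr: a dimensions line, then the
    # unindexed-dimensions section only when there is something to list.
    dim_summary = ', '.join('%s: %d' % (k, v) for k, v in dims.items())
    summary = ['Dimensions: (%s)' % dim_summary]
    unindexed_dims_str = unindexed_dims_repr(dims, coords)
    if unindexed_dims_str:
        summary.append(unindexed_dims_str)
    return '\n'.join(summary)


dims = {'x': 1, 'y': 2}
print(build_summary(dims, coords={}))                      # both dims unindexed
print(build_summary(dims, coords={'x': [10], 'y': [0, 1]}))  # section omitted
```

Returning `None` rather than an empty string lets the caller skip the section entirely, so fully indexed objects keep their previous repr.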
xarray/test/test_dataarray.py

Lines changed: 2 additions & 1 deletion
@@ -46,7 +46,8 @@ def test_repr(self):
         Coordinates:
           * x (x) int64 0 1 2
             other int64 0
-          o time (time) -
+        Unindexed dimensions:
+            time
         Attributes:
             foo: bar""")
         self.assertEqual(expected, repr(data_array))

xarray/test/test_dataset.py

Lines changed: 35 additions & 17 deletions
@@ -91,7 +91,8 @@ def test_repr(self):
           * dim2 (dim2) float64 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0
           * dim3 (dim3) %s 'a' 'b' 'c' 'd' 'e' 'f' 'g' 'h' 'i' 'j'
             numbers (dim3) int64 0 1 2 0 0 1 1 2 2 3
-          o dim1 (dim1) -
+        Unindexed dimensions:
+            dim1
         Data variables:
             var1 (dim1, dim2) float64 -1.086 0.9973 0.283 -1.506 -0.5786 1.651 ...
             var2 (dim1, dim2) float64 1.162 -1.097 -2.123 1.04 -0.4034 -0.126 ...
@@ -203,25 +204,25 @@ def test_info(self):
         expected = dedent(u'''\
         xarray.Dataset {
         dimensions:
-        dim1 = 8 ;
-        dim2 = 9 ;
-        dim3 = 10 ;
-        time = 20 ;
+        \tdim1 = 8 ;
+        \tdim2 = 9 ;
+        \tdim3 = 10 ;
+        \ttime = 20 ;
 
         variables:
-        datetime64[ns] time(time) ;
-        float64 dim2(dim2) ;
-        float64 var1(dim1, dim2) ;
-        var1:foo = variable ;
-        float64 var2(dim1, dim2) ;
-        var2:foo = variable ;
-        float64 var3(dim3, dim1) ;
-        var3:foo = variable ;
-        int64 numbers(dim3) ;
+        \tdatetime64[ns] time(time) ;
+        \tfloat64 dim2(dim2) ;
+        \tfloat64 var1(dim1, dim2) ;
+        \t\tvar1:foo = variable ;
+        \tfloat64 var2(dim1, dim2) ;
+        \t\tvar2:foo = variable ;
+        \tfloat64 var3(dim3, dim1) ;
+        \t\tvar3:foo = variable ;
+        \tint64 numbers(dim3) ;
 
         // global attributes:
-        :unicode_attr = ba® ;
-        :string_attr = bar ;
+        \t:unicode_attr = ba® ;
+        \t:string_attr = bar ;
         }''')
         actual = buf.getvalue()
         self.assertEqual(expected, actual)
@@ -685,6 +686,23 @@ def test_coords_merge_mismatched_shape(self):
         actual = orig_coords.merge(other_coords)
         self.assertDatasetIdentical(expected, actual)
 
+    def test_data_vars_properties(self):
+        ds = Dataset()
+        ds['foo'] = (('x',), [1.0])
+        ds['bar'] = 2.0
+
+        self.assertEqual(set(ds.data_vars), {'foo', 'bar'})
+        self.assertIn('foo', ds.data_vars)
+        self.assertNotIn('x', ds.data_vars)
+        self.assertDataArrayIdentical(ds['foo'], ds.data_vars['foo'])
+
+        expected = dedent("""\
+        Data variables:
+            foo (x) float64 1.0
+            bar float64 2.0""")
+        actual = repr(ds.data_vars)
+        self.assertEqual(expected, actual)
+
     def test_equals_and_identical(self):
         data = create_test_data(seed=42)
         self.assertTrue(data.equals(data))
@@ -3101,7 +3119,7 @@ def test_filter_by_attrs(self):
         ds = Dataset({'temperature_0': (['t'], [0], temp0),
                       'temperature_10': (['t'], [0], temp10),
                       'precipitation': (['t'], [0], precip)},
-                      coords={'time': (['t'], [0], dict(axis='T'))})
+                     coords={'time': (['t'], [0], dict(axis='T'))})
 
         # Test return empty Dataset.
         ds.filter_by_attrs(standard_name='invalid_standard_name')
