129 | 129 | "> on datasets that don’t fit into memory. Currently, Dask is an entirely optional feature\n",
130 | 130 | "> for xarray. However, the benefits of using Dask are sufficiently strong that Dask may\n",
131 | 131 | "> become a required dependency in a future version of xarray.\n",
132 | | - ">\n",
133 | | - "> — <cite>https://docs.xarray.dev/en/stable/use\n",
| 132 | + "\n",
| 133 | + "— <cite>https://docs.xarray.dev/en/stable/use\n",
134 | 134 | "\n",
135 | 135 | "**Which Xarray features support Dask?**\n",
136 | 136 | "\n",
139 | 139 | "> Dask arrays. When you load data as a Dask array in an xarray data structure, almost\n",
140 | 140 | "> all xarray operations will keep it as a Dask array; when this is not possible, they\n",
141 | 141 | "> will raise an exception rather than unexpectedly loading data into memory.\n",
142 | | - ">\n",
143 | | - "> — <cite>https://docs.xarray.dev/en/stable/user-guide/dask.html#using-dask-with-xarray</cite>\n",
| 142 | + "\n",
| 143 | + "— <cite>https://docs.xarray.dev/en/stable/user-guide/dask.html#using-dask-with-xarray</cite>\n",
144 | 144 | "\n",
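In practice this means a chunked `DataArray` stays lazy through most operations until the values are explicitly requested. A minimal sketch, assuming a hypothetical netCDF file `example.nc` with an `air` variable:

```python
import xarray as xr

# Passing `chunks` backs the variables with Dask arrays instead of NumPy arrays.
ds = xr.open_dataset("example.nc", chunks={"time": 100})  # hypothetical file

# Arithmetic and reductions stay lazy; the result is still Dask-backed.
mean_air = ds["air"].mean("time")  # "air" is a placeholder variable name

# Only an explicit step such as .compute(), .load(), or .values pulls data into memory.
result = mean_air.compute()
```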
145 | 145 | "**What is the default Dask behavior for distributing work on compute hardware?**\n",
146 | 146 | "\n",
147 | 147 | "> By default, dask uses its multi-threaded scheduler, which distributes work across\n",
148 | 148 | "> multiple cores and allows for processing some datasets that do not fit into memory.\n",
149 | 149 | "> For running across a cluster, [setup the distributed scheduler](https://docs.dask.org/en/latest/setup.html).\n",
150 | | - ">\n",
151 | | - "> — <cite>https://docs.xarray.dev/en/stable/user-guide/dask.html#using-dask-with-xarray</cite>\n",
| 150 | + "\n",
| 151 | + "— <cite>https://docs.xarray.dev/en/stable/user-guide/dask.html#using-dask-with-xarray</cite>\n",
152 | 152 | "\n",
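To make the scheduler choice explicit on a single machine, Dask's standard configuration mechanism can be used; this is a generic sketch rather than anything specific to the quoted guide:

```python
import dask
import dask.array as da

# The threaded scheduler is already the default for Dask arrays;
# setting it explicitly here is only for illustration.
dask.config.set(scheduler="threads")

x = da.ones((1000, 1000), chunks=(250, 250))  # small example array

# The synchronous scheduler runs everything in the calling thread,
# which can make debugging easier.
with dask.config.set(scheduler="synchronous"):
    print(x.mean().compute())
```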
153 | 153 | "**How do I use Dask arrays in an `xarray.Dataset`?**\n",
154 | 154 | "\n",
161 | 161 | "> `open_mfdataset()` called without `chunks` argument will return dask arrays with\n",
162 | 162 | "> chunk sizes equal to the individual files. Re-chunking the dataset after creation\n",
163 | 163 | "> with `ds.chunk()` will lead to an ineffective use of memory and is not recommended.\n",
164 | | - ">\n",
165 | | - "> — <cite>https://docs.xarray.dev/en/stable/user-guide/dask.html#reading-and-writing-data</cite>\n"
| 164 | + "\n",
| 165 | + "— <cite>https://docs.xarray.dev/en/stable/user-guide/dask.html#reading-and-writing-data</cite>\n"
166 | 166 | ]
167 | 167 | },
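A short illustration of that guidance, with placeholder file names; each input file becomes one chunk when no `chunks` argument is passed:

```python
import xarray as xr

# Combine many files into one dataset; without `chunks`, every file is one Dask chunk.
ds = xr.open_mfdataset("data/air_temperature_*.nc")  # placeholder glob pattern

# Inspect the resulting chunk layout rather than re-chunking with ds.chunk() afterwards.
print(ds["air"].chunks)  # "air" is a placeholder variable name
```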
168 | 168 | {
196 | 196 | "> - Array data formats are often chunked as well. When loading or saving data,\n",
197 | 197 | "> it is useful to have Dask array chunks that are aligned with the chunking\n",
198 | 198 | "> of your storage, often an even multiple times larger in each direction\n",
199 | | - ">\n",
200 | | - "> — <cite>https://docs.dask.org/en/latest/array-chunks.html</cite>\n"
| 199 | + "\n",
| 200 | + "— <cite>https://docs.dask.org/en/latest/array-chunks.html</cite>\n"
201 | 201 | ]
202 | 202 | },
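One way to act on that advice, assuming a hypothetical file whose on-disk chunks are 10 steps long along `time`, is to request in-memory chunks that are an even multiple of that size:

```python
import xarray as xr

# 100 is an even multiple of the assumed on-disk chunk length of 10 along "time".
ds = xr.open_dataset("example.nc", chunks={"time": 100})  # hypothetical file

# For netCDF sources, any on-disk chunking that xarray recorded appears in .encoding.
print(ds["air"].encoding.get("chunksizes"))  # "air" is a placeholder variable name
```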
203 | 203 | {
267 | 267 | ">\n",
268 | 268 | "> - **Single-machine scheduler**: This scheduler provides basic features on a local process or thread pool. This scheduler was made first and is the default. It is simple and cheap to use, although it can only be used on a single machine and does not scale\n",
269 | 269 | "> - **Distributed scheduler**: This scheduler is more sophisticated, offers more features, but also requires a bit more effort to set up. It can run locally or distributed across a cluster\n",
270 | | - ">\n",
271 | | - "> — <cite>https://docs.dask.org/en/stable/scheduling.html</cite>\n",
| 270 | + "\n",
| 271 | + "— <cite>https://docs.dask.org/en/stable/scheduling.html</cite>\n",
272 | 272 | ]
273 | 273 | },
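To opt into the distributed scheduler from the second bullet, creating a `dask.distributed.Client` is usually enough: with no arguments it starts a local cluster and registers itself as the default scheduler. A sketch, with the remote address as a placeholder:

```python
from dask.distributed import Client

# Starts a LocalCluster and makes it the default scheduler for later computations.
client = Client()
print(client.dashboard_link)  # diagnostics dashboard for watching the work

# To use an existing cluster instead, pass its scheduler address:
# client = Client("tcp://scheduler-address:8786")  # placeholder address
```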
274 | 274 | {