Commit 767a23c

Add conda dev environments for Python 3.7/3.8, JDK 8/11 (#238)

* Add conda env files
* Add pytest skip for hive tests when Docker images are missing
* Make sure to specify Python version
* Add pip dependencies to dev environments
* Update docs to use dev environment
* First pass at updating CI to use conda envs
* Use environment variable for conda file location in matrix jobs
* Force openjdk reinstall to resolve Windows failures
* Split off env files by JDK version
* Add sasl to dev envs
* Only install sasl on ubuntu
* Skip hive/postgres testing on Windows
* Attempt to get conda cache working
* Remove shell mamba installs
* Skip failing tests on Windows
* xfail Windows tests instead of skipping
* Update references to conda env files
* Don't copy env file in Docker image
* Attempt to set JAVA_HOME explicitly
* Set correct JAVA_HOME only on Windows
* No longer need to xfail JAVA_HOME check
* Remove unnecessary sys import
* Remove env files with unpinned JDK
* Caching isn't working - try the old method of installing?
* Caching seems to slow down runs - try without?
* Remove conda caching from workflows
* Make maven caching key more obvious
* Keep original conda.txt for Docker purposes

1 parent dcf93ee, commit 767a23c

13 files changed: +219 −80 lines
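The four new environment files follow a strict naming scheme that the CI matrix relies on later via the `CONDA_FILE` variable. A minimal Python sketch of that pattern (the helper function is illustrative, not part of the commit):

```python
def conda_env_file(python: str, jdk: int) -> str:
    """Mirror the CONDA_FILE pattern used in the test workflow:
    continuous_integration/environment-<python>-jdk<java>-dev.yaml"""
    return f"continuous_integration/environment-{python}-jdk{jdk}-dev.yaml"

# The commit adds one environment file per entry of the 2x2 CI matrix:
env_files = [conda_env_file(py, jdk) for py in ("3.7", "3.8") for jdk in (8, 11)]
print(env_files)
```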

.github/workflows/deploy.yml

Lines changed: 3 additions & 7 deletions

@@ -14,22 +14,18 @@ jobs:
       uses: actions/cache@v2
       with:
         path: ~/.m2/repository
-        key: ${{ runner.os }}-maven-v1-${{ hashFiles('**/pom.xml') }}
-    - name: Cache downloaded conda packages
-      uses: actions/cache@v2
-      with:
-        path: ~/conda_pkgs_dir
-        key: ${{ runner.os }}-conda-v3-${{ hashFiles('conda.txt') }}
+        key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
     - name: Set up Python
       uses: conda-incubator/setup-miniconda@v2
       with:
         miniforge-variant: Mambaforge
         use-mamba: true
         python-version: 3.8
+        activate-environment: dask-sql
+        environment-file: continuous_integration/environment-3.8-jdk11-dev.yaml
     - name: Install dependencies
       shell: bash -l {0}
       run: |
-        conda install --file conda.txt -c conda-forge
         pip install setuptools wheel twine
         which python
         pip list
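The new maven cache key combines the runner OS, a manual version tag, a JDK tag, and a digest of all `pom.xml` files, so the cache invalidates whenever any pom changes. A rough Python sketch of how such a content-addressed key behaves (`hash_files` is a simplified stand-in for the actions `hashFiles()` expression, which uses its own combining scheme):

```python
import hashlib
import tempfile
from pathlib import Path

def hash_files(root: str, pattern: str) -> str:
    """Simplified analogue of hashFiles(): one SHA-256 over the contents
    of all matching files, visited in sorted path order."""
    digest = hashlib.sha256()
    for path in sorted(Path(root).glob(pattern)):
        digest.update(path.read_bytes())
    return digest.hexdigest()

def maven_cache_key(runner_os: str, root: str) -> str:
    # Same shape as the workflow key: OS + version tag + JDK tag + pom digest
    return f"{runner_os}-maven-v1-jdk11-{hash_files(root, '**/pom.xml')}"

# Editing a pom.xml changes the key, which forces a fresh cache entry:
workdir = tempfile.mkdtemp()
Path(workdir, "pom.xml").write_text("<project>v1</project>")
key_before = maven_cache_key("Linux", workdir)
Path(workdir, "pom.xml").write_text("<project>v2</project>")
key_after = maven_cache_key("Linux", workdir)
```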

.github/workflows/test.yml

Lines changed: 21 additions & 62 deletions

@@ -21,29 +21,18 @@ jobs:
       uses: actions/cache@v2
       with:
         path: ~/.m2/repository
-        key: ${{ runner.os }}-maven-v1-${{ hashFiles('**/pom.xml') }}
-    - name: Cache downloaded conda packages
-      uses: actions/cache@v2
-      with:
-        path: ~/conda_pkgs_dir
-        key: ${{ runner.os }}-conda-v3-${{ hashFiles('conda.txt') }}
+        key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
     - name: Set up Python
       uses: conda-incubator/setup-miniconda@v2
       with:
         miniforge-variant: Mambaforge
         use-mamba: true
         python-version: 3.8
+        activate-environment: dask-sql
+        environment-file: continuous_integration/environment-3.8-jdk11-dev.yaml
     - name: Install dependencies and build the jar
       shell: bash -l {0}
       run: |
-        mamba install --file conda.txt -c conda-forge
-        which python
-        pip list
-        mamba list
-
-        # This needs to happen in the same
-        # shell, because otherwise the JAVA_HOME
-        # will be wrong on windows
         python setup.py java
     - name: Upload the jar
       uses: actions/upload-artifact@v1
@@ -55,6 +44,8 @@ jobs:
     name: "Test (${{ matrix.os }}, java: ${{ matrix.java }}, python: ${{ matrix.python }})"
     needs: build
     runs-on: ${{ matrix.os }}
+    env:
+      CONDA_FILE: continuous_integration/environment-${{ matrix.python }}-jdk${{ matrix.java }}-dev.yaml
     strategy:
       matrix:
         java: [8, 11]
@@ -66,58 +57,35 @@ jobs:
       uses: actions/cache@v2
       with:
         path: ~/.m2/repository
-        key: ${{ runner.os }}-maven-v1-${{ matrix.java }}-${{ hashFiles('**/pom.xml') }}
-    - name: Cache downloaded conda packages
-      uses: actions/cache@v2
-      with:
-        path: ~/conda_pkgs_dir
-        key: ${{ runner.os }}-conda-v3-${{ matrix.java }}-${{ matrix.python }}-${{ hashFiles('conda.txt') }}
+        key: ${{ runner.os }}-maven-v1-jdk${{ matrix.java }}-${{ hashFiles('**/pom.xml') }}
     - name: Set up Python
       uses: conda-incubator/setup-miniconda@v2
       with:
         miniforge-variant: Mambaforge
         use-mamba: true
         python-version: ${{ matrix.python }}
+        activate-environment: dask-sql
+        environment-file: ${{ env.CONDA_FILE }}
     - name: Download the pre-build jar
       uses: actions/download-artifact@v1
       with:
         name: jar
         path: dask_sql/jar/
-    - name: Install dependencies
-      shell: bash -l {0}
-      run: |
-        mamba install python=${{ matrix.python }} --file conda.txt -c conda-forge
-    - name: Install sqlalchemy and docker pkg for postgres test
+    - name: Install hive testing dependencies for Linux
       shell: bash -l {0}
       run: |
-        # explicitly install docker, fugue and other packages
-        mamba install \
-          sasl>=0.3.1 \
-          sqlalchemy>=1.4.23 \
-          pyhive>=0.6.4 \
-          psycopg2>=2.9.1 \
-          ciso8601>=2.2.0 \
-          tpot>=0.11.7 \
-          mlflow>=1.19.0 \
-          docker-py>=5.0.0 \
-          -c conda-forge
-        pip install "fugue[sql]>=0.5.3"
+        mamba install -c conda-forge sasl>=0.3.1
         docker pull bde2020/hive:2.3.2-postgresql-metastore
         docker pull bde2020/hive-metastore-postgresql:2.3.0
       if: matrix.os == 'ubuntu-latest'
-    - name: Install Java (again) and test with pytest
+    - name: Set proper JAVA_HOME for Windows
+      shell: bash -l {0}
+      run: |
+        echo "JAVA_HOME=${{ env.CONDA }}\envs\dask-sql\Library" >> $GITHUB_ENV
+      if: matrix.os == 'windows-latest'
+    - name: Test with pytest
       shell: bash -l {0}
       run: |
-        mamba install openjdk=${{ matrix.java }}
-        which python
-        pip list
-        mamba list
-
-        # This needs to happen in the same
-        # shell, because otherwise the JAVA_HOME
-        # will be wrong on windows
-        # The --dist loadfile makes sure, that tests are distributed according to their file
-        # this is especially important to not run the docker-fixtures more than once
         pytest --junitxml=junit/test-results.xml --cov-report=xml -n auto tests --dist loadfile
     - name: Upload pytest test results
       uses: actions/upload-artifact@v1
@@ -141,18 +109,15 @@ jobs:
       uses: actions/cache@v2
       with:
         path: ~/.m2/repository
-        key: ${{ runner.os }}-maven-v1-11-${{ hashFiles('**/pom.xml') }}
-    - name: Cache downloaded conda packages
-      uses: actions/cache@v2
-      with:
-        path: ~/conda_pkgs_dir
-        key: ${{ runner.os }}-conda-v3-11-${{ hashFiles('conda.txt') }}
+        key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
     - name: Set up Python
       uses: conda-incubator/setup-miniconda@v2
       with:
         miniforge-variant: Mambaforge
         use-mamba: true
         python-version: 3.8
+        activate-environment: dask-sql
+        environment-file: continuous_integration/environment-3.8-jdk11-dev.yaml
     - name: Download the pre-build jar
       uses: actions/download-artifact@v1
       with:
@@ -161,7 +126,7 @@ jobs:
     - name: Install dependencies
       shell: bash -l {0}
       run: |
-        mamba install python=3.8 python-blosc lz4 --file conda.txt -c conda-forge
+        mamba install python-blosc lz4 -c conda-forge

         which python
         pip list
@@ -192,20 +157,14 @@ jobs:
       uses: actions/cache@v2
       with:
         path: ~/.m2/repository
-        key: ${{ runner.os }}-maven-v1-11-${{ hashFiles('**/pom.xml') }}
-    - name: Cache downloaded conda packages
-      uses: actions/cache@v2
-      with:
-        path: ~/conda_pkgs_dir
-        key: ${{ runner.os }}-conda-v2-11-${{ hashFiles('conda.txt') }}
+        key: ${{ runner.os }}-maven-v1-jdk11-${{ hashFiles('**/pom.xml') }}
     - name: Set up Python
       uses: conda-incubator/setup-miniconda@v2
       with:
         python-version: 3.8
         mamba-version: "*"
         channels: conda-forge,defaults
         channel-priority: true
-        use-only-tar-bz2: true
     - name: Download the pre-build jar
       uses: actions/download-artifact@v1
       with:
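The new Windows-only step writes a `KEY=value` line into the file named by `$GITHUB_ENV`; the runner then exports that variable to all subsequent steps of the job, which is how `JAVA_HOME` ends up pointing at the conda environment's bundled JDK. A minimal sketch of that mechanism (the path is illustrative; multiline values use a delimiter syntax not shown here):

```python
import tempfile

def apply_github_env(path: str, env: dict) -> None:
    """Sketch of how the runner consumes $GITHUB_ENV: every KEY=value
    line a step appends to the file becomes an environment variable
    for the following steps of the same job."""
    with open(path) as f:
        for line in f:
            line = line.rstrip("\n")
            if line and "=" in line:
                key, _, value = line.partition("=")
                env[key] = value

# Simulate what the Windows-only step appends (conda prefix is made up):
with tempfile.NamedTemporaryFile("w", suffix=".env", delete=False) as f:
    f.write("JAVA_HOME=C:\\Miniconda\\envs\\dask-sql\\Library\n")
    github_env_file = f.name

step_env: dict = {}
apply_github_env(github_env_file, step_env)
```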

README.md

Lines changed: 2 additions & 2 deletions

@@ -113,11 +113,11 @@ If you want to have the newest (unreleased) `dask-sql` version or if you plan to

 Create a new conda environment and install the development environment:

-    conda create -n dask-sql --file conda.txt -c conda-forge
+    conda env create -f continuous_integration/environment-3.8-jdk11-dev.yaml

 It is not recommended to use `pip` instead of `conda` for the environment setup.
 If you however need to, make sure to have Java (jdk >= 8) and maven installed and correctly setup before continuing.
-Have a look into `conda.txt` for the rest of the development environment.
+Have a look into `environment-3.8-jdk11-dev.yaml` for the rest of the development environment.

 After that, you can install the package in development mode
continuous_integration/environment-3.7-jdk11-dev.yaml

Lines changed: 42 additions & 0 deletions (new file)

@@ -0,0 +1,42 @@
+name: dask-sql
+channels:
+  - conda-forge
+  - defaults
+dependencies:
+  - adagio>=0.2.3
+  - antlr4-python3-runtime>=4.9.2
+  - black=19.10b0
+  - ciso8601>=2.2.0
+  - dask-ml>=1.7.0
+  - dask>=2.19.0,!=2021.3.0  # dask 2021.3.0 makes dask-ml fail (see https://github.com/dask/dask-ml/issues/803)
+  - fastapi>=0.61.1
+  - fs>=2.4.11
+  - intake>=0.6.0
+  - isort=5.7.0
+  - jpype1>=1.0.2
+  - lightgbm>=3.2.1
+  - maven>=3.6.0
+  - mlflow>=1.19.0
+  - mock>=4.0.3
+  - nest-asyncio>=1.4.3
+  - openjdk=11
+  - pandas>=1.0.0  # below 1.0, there were no nullable ext. types
+  - pip=20.2.4
+  - pre-commit>=2.11.1
+  - prompt_toolkit>=3.0.8
+  - psycopg2>=2.9.1
+  - pyarrow>=0.15.1
+  - pygments>=2.7.1
+  - pyhive>=0.6.4
+  - pytest-cov>=2.10.1
+  - pytest-xdist
+  - pytest>=6.0.1
+  - python=3.7
+  - scikit-learn>=0.24.2
+  - sphinx>=3.2.1
+  - tpot>=0.11.7
+  - triad>=0.5.4
+  - tzlocal>=2.1
+  - uvicorn>=0.11.3
+  - pip:
+      - fugue[sql]>=0.5.3
continuous_integration/environment-3.7-jdk8-dev.yaml

Lines changed: 42 additions & 0 deletions (new file)

@@ -0,0 +1,42 @@
+name: dask-sql
+channels:
+  - conda-forge
+  - defaults
+dependencies:
+  - adagio>=0.2.3
+  - antlr4-python3-runtime>=4.9.2
+  - black=19.10b0
+  - ciso8601>=2.2.0
+  - dask-ml>=1.7.0
+  - dask>=2.19.0,!=2021.3.0  # dask 2021.3.0 makes dask-ml fail (see https://github.com/dask/dask-ml/issues/803)
+  - fastapi>=0.61.1
+  - fs>=2.4.11
+  - intake>=0.6.0
+  - isort=5.7.0
+  - jpype1>=1.0.2
+  - lightgbm>=3.2.1
+  - maven>=3.6.0
+  - mlflow>=1.19.0
+  - mock>=4.0.3
+  - nest-asyncio>=1.4.3
+  - openjdk=8
+  - pandas>=1.0.0  # below 1.0, there were no nullable ext. types
+  - pip=20.2.4
+  - pre-commit>=2.11.1
+  - prompt_toolkit>=3.0.8
+  - psycopg2>=2.9.1
+  - pyarrow>=0.15.1
+  - pygments>=2.7.1
+  - pyhive>=0.6.4
+  - pytest-cov>=2.10.1
+  - pytest-xdist
+  - pytest>=6.0.1
+  - python=3.7
+  - scikit-learn>=0.24.2
+  - sphinx>=3.2.1
+  - tpot>=0.11.7
+  - triad>=0.5.4
+  - tzlocal>=2.1
+  - uvicorn>=0.11.3
+  - pip:
+      - fugue[sql]>=0.5.3
continuous_integration/environment-3.8-jdk11-dev.yaml

Lines changed: 42 additions & 0 deletions (new file)

@@ -0,0 +1,42 @@
+name: dask-sql
+channels:
+  - conda-forge
+  - defaults
+dependencies:
+  - adagio>=0.2.3
+  - antlr4-python3-runtime>=4.9.2
+  - black=19.10b0
+  - ciso8601>=2.2.0
+  - dask-ml>=1.7.0
+  - dask>=2.19.0,!=2021.3.0  # dask 2021.3.0 makes dask-ml fail (see https://github.com/dask/dask-ml/issues/803)
+  - fastapi>=0.61.1
+  - fs>=2.4.11
+  - intake>=0.6.0
+  - isort=5.7.0
+  - jpype1>=1.0.2
+  - lightgbm>=3.2.1
+  - maven>=3.6.0
+  - mlflow>=1.19.0
+  - mock>=4.0.3
+  - nest-asyncio>=1.4.3
+  - openjdk=11
+  - pandas>=1.0.0  # below 1.0, there were no nullable ext. types
+  - pip=20.2.4
+  - pre-commit>=2.11.1
+  - prompt_toolkit>=3.0.8
+  - psycopg2>=2.9.1
+  - pyarrow>=0.15.1
+  - pygments>=2.7.1
+  - pyhive>=0.6.4
+  - pytest-cov>=2.10.1
+  - pytest-xdist
+  - pytest>=6.0.1
+  - python=3.8
+  - scikit-learn>=0.24.2
+  - sphinx>=3.2.1
+  - tpot>=0.11.7
+  - triad>=0.5.4
+  - tzlocal>=2.1
+  - uvicorn>=0.11.3
+  - pip:
+      - fugue[sql]>=0.5.3
continuous_integration/environment-3.8-jdk8-dev.yaml

Lines changed: 42 additions & 0 deletions (new file)

@@ -0,0 +1,42 @@
+name: dask-sql
+channels:
+  - conda-forge
+  - defaults
+dependencies:
+  - adagio>=0.2.3
+  - antlr4-python3-runtime>=4.9.2
+  - black=19.10b0
+  - ciso8601>=2.2.0
+  - dask-ml>=1.7.0
+  - dask>=2.19.0,!=2021.3.0  # dask 2021.3.0 makes dask-ml fail (see https://github.com/dask/dask-ml/issues/803)
+  - fastapi>=0.61.1
+  - fs>=2.4.11
+  - intake>=0.6.0
+  - isort=5.7.0
+  - jpype1>=1.0.2
+  - lightgbm>=3.2.1
+  - maven>=3.6.0
+  - mlflow>=1.19.0
+  - mock>=4.0.3
+  - nest-asyncio>=1.4.3
+  - openjdk=8
+  - pandas>=1.0.0  # below 1.0, there were no nullable ext. types
+  - pip=20.2.4
+  - pre-commit>=2.11.1
+  - prompt_toolkit>=3.0.8
+  - psycopg2>=2.9.1
+  - pyarrow>=0.15.1
+  - pygments>=2.7.1
+  - pyhive>=0.6.4
+  - pytest-cov>=2.10.1
+  - pytest-xdist
+  - pytest>=6.0.1
+  - python=3.8
+  - scikit-learn>=0.24.2
+  - sphinx>=3.2.1
+  - tpot>=0.11.7
+  - triad>=0.5.4
+  - tzlocal>=2.1
+  - uvicorn>=0.11.3
+  - pip:
+      - fugue[sql]>=0.5.3
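The pinned `dask>=2.19.0,!=2021.3.0` line combines a minimum bound with an exclusion of one broken release. A toy sketch of how such a comma-separated spec is evaluated (every clause must hold; real conda/pip version semantics are far richer, e.g. pre-release tags like `19.10b0` are not handled here):

```python
def parse_version(v: str) -> tuple:
    """Compare plain dotted versions numerically."""
    return tuple(int(part) for part in v.split("."))

def satisfies(version: str, spec: str) -> bool:
    """Check a spec like '>=2.19.0,!=2021.3.0': every clause must hold."""
    for clause in spec.split(","):
        if clause.startswith(">="):
            if parse_version(version) < parse_version(clause[2:]):
                return False  # below the minimum bound
        elif clause.startswith("!="):
            if version == clause[2:]:
                return False  # explicitly excluded release
    return True
```

Under this sketch, `2021.3.0` is rejected by the `!=` clause even though it clears the minimum bound, which is exactly why the exclusion is needed on top of `>=`.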
conda.txt → docker/conda.txt (file renamed without changes)

docker/main.dockerfile

Lines changed: 1 addition & 1 deletion

@@ -4,7 +4,7 @@ FROM daskdev/dask:latest
 LABEL author "Nils Braun <[email protected]>"

 # Install dependencies for dask-sql
-COPY conda.txt /opt/dask_sql/
+COPY docker/conda.txt /opt/dask_sql/
 RUN conda config --add channels conda-forge \
     && /opt/conda/bin/conda install --freeze-installed \
     "jpype1>=1.0.2" \

docs/pages/installation.rst

Lines changed: 2 additions & 2 deletions

@@ -57,11 +57,11 @@ Create a new conda environment and install the development environment:

 .. code-block:: bash

-    conda create -n dask-sql --file conda.txt -c conda-forge
+    conda env create -f continuous_integration/environment-3.8-jdk11-dev.yaml

 It is not recommended to use ``pip`` instead of ``conda``.
 If you however need to, make sure to have Java (jdk >= 8) and maven installed and correctly setup before continuing.
-Have a look into ``conda.txt`` for the rest of the development environment.
+Have a look into ``environment-3.8-jdk11-dev.yaml`` for the rest of the development environment.

 After that, you can install the package in development mode