
Commit 0a1f5ab

dongjoon-hyun authored and HyukjinKwon committed
[SPARK-48710][PYTHON][FOLLOWUP] PySpark rdd test should not fail on optional dependencies
### What changes were proposed in this pull request?

This is a follow-up of #47083 to recover PySpark RDD tests.

### Why are the changes needed?

`PySpark Core` test should not fail on optional dependencies.

**BEFORE**

```
$ python/run-tests.py --python-executables python3 --modules pyspark-core
...
  File "/Users/dongjoon/APACHE/spark-merge/python/pyspark/core/rdd.py", line 5376, in _test
    import numpy as np
ModuleNotFoundError: No module named 'numpy'
```

**AFTER**

```
$ python/run-tests.py --python-executables python3 --modules pyspark-core
...
Tests passed in 189 seconds

Skipped tests in pyspark.tests.test_memory_profiler with python3:
    test_assert_vanilla_mode (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_assert_vanilla_mode) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_aggregate_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_aggregate_in_pandas) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_clear (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_clear) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_cogroup_apply_in_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_cogroup_apply_in_arrow) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_cogroup_apply_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_cogroup_apply_in_pandas) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_group_apply_in_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_group_apply_in_arrow) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_group_apply_in_pandas (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_group_apply_in_pandas) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_map_in_pandas_not_supported (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_map_in_pandas_not_supported) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_pandas_udf (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_pandas_udf_iterator_not_supported (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf_iterator_not_supported) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_pandas_udf_window (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_pandas_udf_window) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_udf (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_udf) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_udf_multiple_actions (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_udf_multiple_actions) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_udf_registered (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_udf_registered) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler_udf_with_arrow (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_memory_profiler_udf_with_arrow) ... skipped 'Must have memory-profiler installed.'
    test_profilers_clear (pyspark.tests.test_memory_profiler.MemoryProfiler2Tests.test_profilers_clear) ... skipped 'Must have memory-profiler installed.'
    test_code_map (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_code_map) ... skipped 'Must have memory-profiler installed.'
    test_memory_profiler (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_memory_profiler) ... skipped 'Must have memory-profiler installed.'
    test_profile_pandas_function_api (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_profile_pandas_function_api) ... skipped 'Must have memory-profiler installed.'
    test_profile_pandas_udf (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_profile_pandas_udf) ... skipped 'Must have memory-profiler installed.'
    test_udf_line_profiler (pyspark.tests.test_memory_profiler.MemoryProfilerTests.test_udf_line_profiler) ... skipped 'Must have memory-profiler installed.'

Skipped tests in pyspark.tests.test_rdd with python3:
    test_take_on_jrdd_with_large_rows_should_not_cause_deadlock (pyspark.tests.test_rdd.RDDTests.test_take_on_jrdd_with_large_rows_should_not_cause_deadlock) ... skipped 'NumPy or Pandas not installed'

Skipped tests in pyspark.tests.test_serializers with python3:
    test_statcounter_array (pyspark.tests.test_serializers.NumPyTests.test_statcounter_array) ... skipped 'NumPy not installed'
    test_serialize (pyspark.tests.test_serializers.SciPyTests.test_serialize) ... skipped 'SciPy not installed'

Skipped tests in pyspark.tests.test_worker with python3:
    test_memory_limit (pyspark.tests.test_worker.WorkerMemoryTest.test_memory_limit) ... skipped "Memory limit feature in Python worker is dependent on Python's 'resource' module on Linux; however, not found or not on Linux."
    test_python_segfault (pyspark.tests.test_worker.WorkerSegfaultNonDaemonTest.test_python_segfault) ... skipped 'SPARK-46130: Flaky with Python 3.12'
    test_python_segfault (pyspark.tests.test_worker.WorkerSegfaultTest.test_python_segfault) ... skipped 'SPARK-46130: Flaky with Python 3.12'
```

### Does this PR introduce _any_ user-facing change?

No. The failure happens during testing.

### How was this patch tested?

Pass the CIs and do the manual test without optional dependencies.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes #47526 from dongjoon-hyun/SPARK-48710.

Authored-by: Dongjoon Hyun <[email protected]>
Signed-off-by: Hyukjin Kwon <[email protected]>
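For context on the AFTER output: the remaining optional-dependency tests turn into skips rather than errors because PySpark's test modules guard them with `unittest` skip decorators, so a missing package is reported instead of raised. The sketch below is a hypothetical, minimal version of that guard pattern; the class name, the `have_numpy` flag, and the test body are illustrative stand-ins, not the actual symbols used in `pyspark.testing` or `pyspark.tests.test_serializers`.

```python
import unittest

# Probe the optional dependency once at import time.
try:
    import numpy  # noqa: F401

    have_numpy = True
except ImportError:
    have_numpy = False


class OptionalNumPyTests(unittest.TestCase):
    @unittest.skipIf(not have_numpy, "NumPy not installed")
    def test_array_sum(self) -> None:
        # Runs only when numpy is importable; otherwise the runner reports
        # "skipped 'NumPy not installed'", matching the AFTER output above.
        import numpy as np

        self.assertEqual(int(np.array([1, 2, 3]).sum()), 6)


if __name__ == "__main__":
    unittest.main(verbosity=2)
```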
1 parent 0b48d3f · commit 0a1f5ab

File tree (1 file changed: +1, -1)

python/pyspark/core/rdd.py

Lines changed: 1 addition & 1 deletion
```diff
@@ -5379,7 +5379,7 @@ def _test() -> None:
         if Version(np.__version__) >= Version("2"):
             # `legacy="1.25"` only available in `nump>=2`
             np.set_printoptions(legacy="1.25")  # type: ignore[arg-type]
-    except TypeError:
+    except (ModuleNotFoundError, TypeError):
         pass
 
     tmp_dir = tempfile.TemporaryDirectory()
```
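For readers outside the Spark codebase, the hunk above sits inside the module's doctest entry point: `numpy` is imported there only to pin its print options so doctest output stays stable across numpy versions, and the widened `except` lets the whole block degrade gracefully when `numpy` is absent. A minimal standalone sketch of that pattern follows; it is an approximation of `_test()` in `rdd.py`, and the `packaging.version.Version` import and the trailing `doctest.testmod()` call are assumptions, not the file's exact contents.

```python
import doctest


def _test() -> None:
    # numpy is an optional, test-time-only dependency here: it is used solely
    # to keep doctest output stable across numpy 1.x and 2.x.
    try:
        import numpy as np
        from packaging.version import Version  # assumption: any version comparator works

        if Version(np.__version__) >= Version("2"):
            # numpy 2.x changed scalar repr; legacy="1.25" restores 1.x-style output.
            np.set_printoptions(legacy="1.25")
    except (ModuleNotFoundError, TypeError):
        # ModuleNotFoundError: numpy (or the version helper) is not installed.
        # TypeError: an older numpy rejects the legacy value.
        # Either way, skip the tweak instead of failing the test run.
        pass

    # The doctests themselves run regardless of whether numpy was available.
    doctest.testmod()


if __name__ == "__main__":
    _test()
```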
