Contributing Guide for Type Hints #27050
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -710,6 +710,136 @@ You'll also need to | |
|
||
See :ref:`contributing.warnings` for more. | ||
|
||
.. _contributing.type_hints: | ||
|
||
Type Hints | ||
---------- | ||
|
||
*pandas* strongly encourages the use of :pep:`484` style type hints. New development should contain type hints, and pull requests that annotate existing code are accepted as well! | ||
|
||
Style Guidelines | ||
~~~~~~~~~~~~~~~~ | ||
|
||
Type imports should follow the ``from typing import ...`` convention. So rather than | ||
Review comment: we could add a codecheck for this? |
||
|
||
.. code-block:: python | ||
|
||
import typing | ||
|
||
primes = [] # type: typing.List[int] | ||
|
||
you should write | ||
|
||
.. code-block:: python | ||
|
||
from typing import List, Optional, Union | ||
|
||
primes = [] # type: List[int] | ||
|
||
``Optional`` should be used where applicable, so instead of | ||
|
||
.. code-block:: python | ||
|
||
maybe_primes = [] # type: List[Union[int, None]] | ||
|
||
you should write | ||
|
||
.. code-block:: python | ||
|
||
maybe_primes = [] # type: List[Optional[int]] | ||
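As a quick check of the guideline above: ``Optional[X]`` is shorthand for ``Union[X, None]``, so both spellings describe the same type and only readability differs. A minimal sketch using only the standard library (the variable names are illustrative and not taken from the pandas code base):

```python
from typing import List, Optional, Union

# Optional[int] is defined as Union[int, None]; typing treats the two as equal.
same_type = Optional[int] == Union[int, None]

maybe_primes = [2, None, 3]  # type: List[Optional[int]]

# Filtering out None leaves a plain list of ints.
only_primes = [p for p in maybe_primes if p is not None]  # type: List[int]
```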
|
||
In some cases, classes in the code base may define class variables that shadow builtins. This causes an issue as described in `Mypy 1775 <https://github.com/python/mypy/issues/1775#issuecomment-310969854>`_. The defensive solution here is to create an unambiguous alias of the builtin and use that within your annotation. For example, if you come across a definition like | ||
|
||
.. code-block:: python | ||
|
||
class SomeClass1: | ||
    str = None | ||
|
||
The appropriate way to annotate this would be as follows | ||
|
||
.. code-block:: python | ||
|
||
str_type = str | ||
|
||
class SomeClass2: | ||
    str = None # type: str_type | ||
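The same name resolution can be observed at runtime, which may help explain why the alias is needed. The sketch below is only a runtime analogue under stated assumptions: the class and method names are hypothetical, and the messages mypy reports for the shadowed case may differ from what is shown here.

```python
class Shadowing:
    str = None  # class variable that shadows the builtin

    def name(self) -> str:  # inside the class body, ``str`` is the class variable
        return "shadowing"


str_type = str  # unambiguous alias, created outside the class


class Aliased:
    str = None  # type: str_type

    def name(self) -> str_type:  # the alias still points at the builtin
        return "aliased"


# Inspect what each return annotation actually resolved to.
shadowed_hint = Shadowing.name.__annotations__["return"]
aliased_hint = Aliased.name.__annotations__["return"]
```

In the first class the return annotation resolves to the class variable (``None``) rather than the builtin, while the aliased version resolves to ``str`` as intended.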
|
||
In some cases you may be tempted to use ``cast`` from the ``typing`` module when you know better than the analyzer. This occurs particularly when using custom inference functions. For example | ||
Review comment: @jorisvandenbossche let me know if this helps with understanding of cast
Review comment: Is it also an option to just "leave it" for some cases? Or does mypy error if it cannot infer a type? Because I have the feeling we have a lot of such cases, we use those
Review comment: Not sure I understand what you mean by “leave it” but once the annotation is added in the signature here any op which assumes a string but which Mypy can’t narrow inference down to will raise (here it would say something like int/float has no attribute “upper”). For sure though I think we will have a few places in the code base where cast would be required, at least unless the referenced Mypy enhancement is implemented
Review comment: Yes, I basically meant "leave it untyped" (as you can also leave complete functions untyped). So but you answered: once you type the signature of a function, mypy needs to understand the full body of that function.
I don't have the feeling it is "a few". We do such
There doesn't seem to be much movement in that issue at the moment? |
||
|
||
.. code-block:: python | ||
|
||
from typing import Union, cast | ||
|
||
from pandas.core.dtypes.common import is_number | ||
|
||
def cannot_infer_bad(obj: Union[str, int, float]): | ||
|
||
    if is_number(obj): | ||
        ... | ||
    else: # Reasonably only str objects would reach this but... | ||
        obj = cast(str, obj) # Mypy complains without this! | ||
        return obj.upper() | ||
|
||
The limitation here is that while a human can reasonably understand that ``is_number`` would catch the ``int`` and ``float`` types, mypy cannot make that same inference just yet (see `mypy #5206 <https://github.com/python/mypy/issues/5206>`_). While the above works, the use of ``cast`` is **strongly discouraged**. Where applicable, a refactor of the code to appease static analysis is preferable | ||
|
||
.. code-block:: python | ||
|
||
def cannot_infer_good(obj: Union[str, int, float]): | ||
|
||
    if isinstance(obj, str): | ||
        return obj.upper() | ||
    else: | ||
        ... | ||
|
||
With custom types and inference this is not always possible, so exceptions are made, but every other option should be exhausted before resorting to ``cast``. | ||
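One reason to prefer the refactor is that ``cast`` performs no runtime check: it returns its argument unchanged and only tells the type checker what to assume. The sketch below illustrates this with two hypothetical helper functions (they are not part of pandas); the ``isinstance`` version narrows the type for mypy and is also enforced when the code runs.

```python
from typing import Union, cast


def shout_with_cast(obj: Union[str, int, float]) -> str:
    # cast() only informs the type checker; nothing is verified at runtime.
    text = cast(str, obj)
    return text.upper()


def shout_with_isinstance(obj: Union[str, int, float]) -> str:
    # isinstance() narrows the type for mypy and is checked at runtime.
    if isinstance(obj, str):
        return obj.upper()
    return str(obj)


unchanged = cast(str, 42)  # the value is still the int 42
```

Calling ``shout_with_cast(42)`` passes the cast and then fails with ``AttributeError`` on ``.upper()``, whereas ``shout_with_isinstance(42)`` takes the non-string branch.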
|
||
Syntax Requirements | ||
~~~~~~~~~~~~~~~~~~~ | ||
|
||
Because *pandas* still supports Python 3.5, :pep:`526` does not apply and variables **must** be annotated with type comments. Specifically, this is a valid annotation within pandas: | ||
|
||
.. code-block:: python | ||
|
||
primes = [] # type: List[int] | ||
|
||
Whereas this is **NOT** allowed: | ||
|
||
.. code-block:: python | ||
|
||
primes: List[int] = [] # not supported in Python 3.5! | ||
|
||
Note that function signatures can always be annotated per :pep:`3107`: | ||
|
||
.. code-block:: python | ||
|
||
def sum_of_primes(primes: List[int] = []) -> int: | ||
    ... | ||
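Putting the two rules together, a small sketch of code that follows them: the variable carries a type comment, and the function carries signature annotations. The function body here is an assumption added for illustration. Note that only the signature annotations are visible at runtime; the type comment is read solely by the type checker.

```python
from typing import List

primes = [2, 3, 5]  # type: List[int]


def sum_of_primes(primes: List[int]) -> int:
    # Illustrative body; the guide itself leaves this function unimplemented.
    return sum(primes)


# Signature annotations are stored on the function object.
signature_hints = sum_of_primes.__annotations__
total = sum_of_primes(primes)
```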
|
||
|
||
Pandas-specific Types | ||
~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
Commonly used types specific to *pandas* will appear in `pandas._typing <https://github.com/pandas-dev/pandas/blob/master/pandas/_typing.py>`_ and you should use these where applicable. This module is private for now, but ultimately it should be exposed to third-party libraries that want to implement type checking against pandas. | ||
|
||
For example, quite a few functions in *pandas* accept a ``dtype`` argument. This can be expressed as a string like ``"object"``, a ``numpy.dtype`` like ``np.int64`` or even a pandas ``ExtensionDtype`` like ``pd.CategoricalDtype``. Rather than burdening contributors with annotating all of those options every time, a single type can simply be imported and reused from the ``pandas._typing`` module | ||
|
||
.. code-block:: python | ||
|
||
from pandas._typing import Dtype | ||
|
||
def as_type(dtype: Dtype) -> ...: | ||
    ... | ||
|
||
This module will ultimately house types for repeatedly used concepts like "path-like", "array-like", "numeric", etc., and can also hold aliases for commonly appearing parameters like ``axis``. Development of this module is active, so be sure to refer to the source for the most up-to-date list of available types. | ||
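To make the idea of a shared alias concrete, here is a minimal sketch of how such an alias could be defined once and reused in signatures. The ``Axis`` alias and the ``resolve_axis`` function are hypothetical and shown for illustration only; check the ``pandas._typing`` source for the names that actually exist.

```python
from typing import Union

# Hypothetical alias for illustration; not claimed to exist in pandas._typing.
Axis = Union[int, str]


def resolve_axis(axis: Axis) -> int:
    """Map an axis given as a number or a name to its integer position."""
    names = {"index": 0, "rows": 0, "columns": 1}
    if isinstance(axis, str):
        return names[axis]
    return axis
```

Defining the union once keeps signatures short and means a later change to the accepted types happens in a single place.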
|
||
Validating Type Hints | ||
~~~~~~~~~~~~~~~~~~~~~ | ||
|
||
*pandas* uses `mypy <http://mypy-lang.org>`_ to statically analyze the code base and type hints. After making any change you can ensure your type hints are correct by running | ||
|
||
.. code-block:: shell | ||
|
||
mypy pandas | ||
Review comment: Can you also type check single files or submodules? (for quicker development turnover, if you are trying out type checking whole pandas takes a while)
Review comment: You could but not generically useful as mypy doggedly follows all imports so wouldn't necessarily save much time: https://mypy.readthedocs.io/en/latest/running_mypy.html#following-imports If type checking speed is a concern the suggested approach is to use a daemon: https://mypy.readthedocs.io/en/latest/mypy_daemon.html#mypy-daemon-mypy-server
Review comment: @jorisvandenbossche yes you can do something like this:
Review comment: I don’t plan on adding this - it’s not value added to do
Review comment: Well that's true, someone can easily figure out the command for a single file/folder by making a wise guess or going to mypy docs. |
||
|
||
.. _contributing.ci: | ||
|
||
|