diff --git a/doc/source/getting_started/overview.rst b/doc/source/getting_started/overview.rst
index d8a40c5406dee..032ba73a7293d 100644
--- a/doc/source/getting_started/overview.rst
+++ b/doc/source/getting_started/overview.rst
@@ -9,9 +9,9 @@ Package overview
 **pandas** is a `Python `__ package providing fast,
 flexible, and expressive data structures designed to make working with
 "relational" or "labeled" data both easy and intuitive. It aims to be the
-fundamental high-level building block for doing practical, **real world** data
+fundamental high-level building block for doing practical, **real-world** data
 analysis in Python. Additionally, it has the broader goal of becoming **the
-most powerful and flexible open source data analysis / manipulation tool
+most powerful and flexible open source data analysis/manipulation tool
 available in any language**. It is already well on its way toward this goal.
 
 pandas is well suited for many different kinds of data:
@@ -21,7 +21,7 @@ pandas is well suited for many different kinds of data:
   - Ordered and unordered (not necessarily fixed-frequency) time series data.
   - Arbitrary matrix data (homogeneously typed or heterogeneous) with row and
     column labels
-  - Any other form of observational / statistical data sets. The data actually
+  - Any other form of observational / statistical data sets. The data
     need not be labeled at all to be placed into a pandas data structure
 
 The two primary data structures of pandas, :class:`Series` (1-dimensional)
@@ -57,7 +57,7 @@ Here are just a few of the things that pandas does well:
     Excel files, databases, and saving / loading data from the ultrafast **HDF5
     format**
   - **Time series**-specific functionality: date range generation and frequency
-    conversion, moving window statistics, date shifting and lagging.
+    conversion, moving window statistics, date shifting, and lagging.
 
 Many of these principles are here to address the shortcomings frequently
 experienced using other languages / scientific research environments. For data
@@ -101,12 +101,12 @@ fashion.
 
 Also, we would like sensible default behaviors for the common API functions
 which take into account the typical orientation of time series and
-cross-sectional data sets. When using ndarrays to store 2- and 3-dimensional
+cross-sectional data sets. When using the N-dimensional array (ndarrays) to store 2- and 3-dimensional
 data, a burden is placed on the user to consider the orientation of the data
 set when writing functions; axes are considered more or less equivalent (except
 when C- or Fortran-contiguousness matters for performance). In pandas, the axes
 are intended to lend more semantic meaning to the data; i.e., for a particular
-data set there is likely to be a "right" way to orient the data. The goal,
+data set, there is likely to be a "right" way to orient the data. The goal,
 then, is to reduce the amount of mental effort required to code up data
 transformations in downstream functions.
 
@@ -148,8 +148,8 @@ pandas possible. Thanks to `all of our contributors `.
 
 pandas is a `NumFOCUS `__ sponsored project.
-This will help ensure the success of development of pandas as a world-class open-source
-project, and makes it possible to `donate `__ to the project.
+This will help ensure the success of the development of pandas as a world-class open-source
+project and makes it possible to `donate `__ to the project.
 
 Project governance
 ------------------