8th day of python challenges 111-117
@@ -0,0 +1,97 @@
Metadata-Version: 2.1
Name: pandas
Version: 0.25.0
Summary: Powerful data structures for data analysis, time series, and statistics
Home-page: http://pandas.pydata.org
Maintainer: The PyData Development Team
Maintainer-email: pydata@googlegroups.com
License: BSD
Project-URL: Bug Tracker, https://github.com/pandas-dev/pandas/issues
Project-URL: Documentation, http://pandas.pydata.org/pandas-docs/stable/
Project-URL: Source Code, https://github.com/pandas-dev/pandas
Platform: any
Classifier: Development Status :: 5 - Production/Stable
Classifier: Environment :: Console
Classifier: Operating System :: OS Independent
Classifier: Intended Audience :: Science/Research
Classifier: Programming Language :: Python
Classifier: Programming Language :: Python :: 3
Classifier: Programming Language :: Python :: 3.5
Classifier: Programming Language :: Python :: 3.6
Classifier: Programming Language :: Python :: 3.7
Classifier: Programming Language :: Cython
Classifier: Topic :: Scientific/Engineering
Requires-Python: >=3.5.3
Provides-Extra: test
Requires-Dist: python-dateutil (>=2.6.1)
Requires-Dist: pytz (>=2017.2)
Requires-Dist: numpy (>=1.13.3)
Provides-Extra: test
Requires-Dist: pytest (>=4.0.2); extra == 'test'
Requires-Dist: pytest-xdist; extra == 'test'
Requires-Dist: hypothesis (>=3.58); extra == 'test'


**pandas** is a Python package providing fast, flexible, and expressive data
structures designed to make working with structured (tabular, multidimensional,
potentially heterogeneous) and time series data both easy and intuitive. It
aims to be the fundamental high-level building block for doing practical,
**real world** data analysis in Python. Additionally, it has the broader goal
of becoming **the most powerful and flexible open source data analysis /
manipulation tool available in any language**. It is already well on its way
toward this goal.

pandas is well suited for many different kinds of data:

- Tabular data with heterogeneously-typed columns, as in an SQL table or
  Excel spreadsheet
- Ordered and unordered (not necessarily fixed-frequency) time series data.
- Arbitrary matrix data (homogeneously typed or heterogeneous) with row and
  column labels
- Any other form of observational / statistical data sets. The data actually
  need not be labeled at all to be placed into a pandas data structure

The two primary data structures of pandas, Series (1-dimensional) and DataFrame
(2-dimensional), handle the vast majority of typical use cases in finance,
statistics, social science, and many areas of engineering. For R users,
DataFrame provides everything that R's ``data.frame`` provides and much
more. pandas is built on top of `NumPy <http://www.numpy.org>`__ and is
intended to integrate well within a scientific computing environment with many
other 3rd party libraries.
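
As a rough illustration of the two structures described above, the following sketch builds a small Series and DataFrame and pulls a column out as a NumPy array. The labels, column names, and values are made up for the example and are not part of the package metadata.

```python
import numpy as np
import pandas as pd

# A Series is a 1-dimensional labeled array.
prices = pd.Series([10.5, 11.0, 9.75], index=["a", "b", "c"], name="price")

# A DataFrame is a 2-dimensional table whose columns may have different types.
table = pd.DataFrame(
    {
        "ticker": ["AAA", "BBB", "CCC"],  # strings (hypothetical tickers)
        "price": [10.5, 11.0, 9.75],      # floats
        "volume": [100, 250, 75],         # integers
    },
    index=["a", "b", "c"],
)

# Each column of a DataFrame is itself a Series.
price_column = table["price"]

# The column's values can be handed to NumPy code as a plain array.
price_array = price_column.to_numpy()
```

Selecting a single column returns a Series that shares the DataFrame's row labels, which is what lets the two structures be combined freely.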

Here are just a few of the things that pandas does well:

- Easy handling of **missing data** (represented as NaN) in floating point as
  well as non-floating point data
- Size mutability: columns can be **inserted and deleted** from DataFrame and
  higher dimensional objects
- Automatic and explicit **data alignment**: objects can be explicitly
  aligned to a set of labels, or the user can simply ignore the labels and
  let `Series`, `DataFrame`, etc. automatically align the data for you in
  computations
- Powerful, flexible **group by** functionality to perform
  split-apply-combine operations on data sets, for both aggregating and
  transforming data
- Make it **easy to convert** ragged, differently-indexed data in other
  Python and NumPy data structures into DataFrame objects
- Intelligent label-based **slicing**, **fancy indexing**, and **subsetting**
  of large data sets
- Intuitive **merging** and **joining** data sets
- Flexible **reshaping** and pivoting of data sets
- **Hierarchical** labeling of axes (possible to have multiple labels per
  tick)
- Robust IO tools for loading data from **flat files** (CSV and delimited),
  Excel files, databases, and saving / loading data from the ultrafast **HDF5
  format**
- **Time series**-specific functionality: date range generation and frequency
  conversion, moving window statistics, moving window linear regressions,
  date shifting and lagging, etc.
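
Three of the features listed above (missing data, automatic alignment, and group by) can be sketched in a few lines. This is a minimal illustration with invented data, not an excerpt from the pandas documentation.

```python
import numpy as np
import pandas as pd

# Missing data: NaN marks absent values, and reductions skip it by default.
readings = pd.Series([1.0, np.nan, 3.0])
mean_reading = readings.mean()   # averages only the two present values
filled = readings.fillna(0.0)    # or replace the gap explicitly

# Automatic alignment: arithmetic matches values by label, not by position.
left = pd.Series([1, 2, 3], index=["a", "b", "c"])
right = pd.Series([10, 20, 30], index=["b", "c", "d"])
total = left + right             # "a" and "d" have no partner, so they become NaN

# Group by: split rows by key, apply an aggregation, combine the results.
sales = pd.DataFrame(
    {"region": ["north", "south", "north", "south"], "amount": [5, 7, 11, 13]}
)
per_region = sales.groupby("region")["amount"].sum()
```

Note that the alignment step produces the union of both indexes, so labels present on only one side survive in the result as missing values instead of being silently dropped.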

Many of these principles are here to address the shortcomings frequently
experienced using other languages / scientific research environments. For data
scientists, working with data is typically divided into multiple stages:
munging and cleaning data, analyzing / modeling it, then organizing the results
of the analysis into a form suitable for plotting or tabular display. pandas is
the ideal tool for all of these tasks.
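
The stages named above (clean, analyze, present) might look like this on a tiny, hypothetical data set. The column names and the cleaning rules are assumptions chosen for the example.

```python
import numpy as np
import pandas as pd

# Hypothetical raw measurements with a gap and inconsistent labels.
raw = pd.DataFrame(
    {
        "site": ["A", "a", "B", "B"],
        "value": [1.0, np.nan, 4.0, 6.0],
    }
)

# 1. Munge and clean: normalize the labels, then drop rows with no value.
clean = raw.assign(site=raw["site"].str.upper()).dropna(subset=["value"])

# 2. Analyze: summarize each site.
summary = clean.groupby("site")["value"].agg(["count", "mean"])

# 3. Present: render the result as plain text for tabular display.
report = summary.to_string()
print(report)
```

Each stage returns a new object, so intermediate results such as `clean` stay available for inspection or plotting.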