DEV Community

Rakesh KR

In One Minute : Pandas

Pandas (the name comes from "PANel DAta") is a Python library for data manipulation and analysis. Typical inputs are multidimensional time series and cross-sectional data sets of the kind found in statistics, experimental science results, econometrics, and finance.
Pandas is implemented primarily using NumPy and Cython, and it is designed to integrate easily with NumPy-based scientific libraries such as statsmodels.
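
A minimal sketch of the NumPy integration mentioned above. The sample values and variable names are made up for illustration; the point is only that NumPy functions accept pandas objects and that a Series can hand its values back as a NumPy array.

```python
import numpy as np
import pandas as pd

# Hypothetical sample data, chosen only for illustration.
prices = pd.Series([10.0, 12.5, 11.0], index=["a", "b", "c"])

# NumPy ufuncs accept a Series and return a Series with the same labels.
log_prices = np.log(prices)

# to_numpy() exposes the underlying values as a plain NumPy array.
values = prices.to_numpy()
total = float(values.sum())

print(log_prices)
print(type(values), total)
```

Because the labels survive the NumPy call, results can be fed straight back into further pandas operations.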

Pandas is one of the main data science libraries in Python.

Pandas can import data from various sources, such as comma-separated values (CSV), JSON, Parquet, Microsoft Excel files, and SQL database tables or queries.
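
As a rough illustration of importing data, the sketch below reads the same small table from CSV text, JSON text, and an in-memory SQLite database. The data, the table name `weather`, and the column names are invented for the example. Parquet and Excel are left out because they need optional third-party engines.

```python
import io
import sqlite3

import pandas as pd

# Hypothetical CSV content; in practice this would be a file path or URL.
csv_text = "city,temp_c\nOslo,4.5\nCairo,29.0\n"
df = pd.read_csv(io.StringIO(csv_text))

# Round trip through JSON (one object per row).
json_text = df.to_json(orient="records")
df_from_json = pd.read_json(io.StringIO(json_text), orient="records")

# Round trip through a SQL table, using an in-memory SQLite database.
conn = sqlite3.connect(":memory:")
df.to_sql("weather", conn, index=False)
df_from_sql = pd.read_sql("SELECT city, temp_c FROM weather ORDER BY temp_c", conn)
conn.close()

print(df_from_sql)
```

Each reader returns a DataFrame, so the rest of an analysis does not need to care where the data came from.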
Pandas supports data manipulation operations such as merging, reshaping, and selecting, as well as data cleaning and data wrangling.
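
The operations named above can be sketched on two tiny tables. Everything here (the `orders` and `customers` frames, their columns, and the choice to fill missing amounts with 0.0) is an assumption made for the example, not a recommended cleaning policy.

```python
import pandas as pd

# Hypothetical input tables.
orders = pd.DataFrame({
    "order_id": [1, 2, 3],
    "customer": ["ann", "bob", "ann"],
    "amount": [20.0, None, 15.0],
})
customers = pd.DataFrame({
    "customer": ["ann", "bob"],
    "country": ["NO", "EG"],
})

# Cleaning: replace the missing amount with 0.0.
orders_clean = orders.fillna({"amount": 0.0})

# Merging: attach each customer's country to their orders.
merged = orders_clean.merge(customers, on="customer", how="left")

# Selecting: rows for one customer, and only two of the columns.
ann_orders = merged.loc[merged["customer"] == "ann", ["order_id", "amount"]]

# Reshaping: total amount per country.
by_country = merged.pivot_table(index="country", values="amount", aggfunc="sum")

print(by_country)
```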

Main Features:

  • Data structures for one- and two-dimensional labeled datasets (Series and DataFrames, respectively). Some of their main features include:
  1. Automatic data alignment and interpolation
  2. Handling of missing observations in calculations
  3. Convenient slicing and reshaping ("reindexing") functions
  4. Categorical data types
  5. 'Group by' aggregation and transformation functionality
  6. Tools for merging and joining data sets
  7. Simple Matplotlib integration for plotting and graphing
  8. Multi-indexing, which gives indices a hierarchical structure so that they can represent an arbitrary number of dimensions
  • Date tools: objects for expressing date offsets or generating date ranges. Dates can be aligned to a specific time zone and converted or compared at will
  • Statistical models: early pandas releases included convenient ordinary least squares and panel OLS implementations for in-sample or rolling time series and cross-sectional regressions. These were later removed from pandas; regression models are now provided by libraries such as statsmodels
  • Cython offloading: performance-critical computations run in compiled Cython code, which keeps them fast
  • Static and moving statistical tools: mean, standard deviation, correlation, and covariance
  • Rich User Documentation, using Sphinx
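
A few of the features listed above can be sketched in one short script: label alignment, missing-value handling, group-by aggregation, date ranges with time zones, and a moving statistic. All data and names are invented for illustration, and the time zone "Europe/Oslo" is an arbitrary choice.

```python
import pandas as pd

# Automatic alignment: arithmetic matches values by label, not by position.
a = pd.Series([1.0, 2.0, 3.0], index=["x", "y", "z"])
b = pd.Series([10.0, 20.0], index=["y", "z"])
aligned_sum = a + b  # "x" has no partner in b, so the result there is NaN

# Missing observations are skipped by default in reductions such as mean().
mean_without_nan = aligned_sum.mean()

# Group-by aggregation: total score per team.
df = pd.DataFrame({"team": ["r", "b", "r", "b"], "score": [1, 2, 3, 4]})
team_totals = df.groupby("team")["score"].sum()

# Date tools: a daily range in UTC, converted to another time zone.
days = pd.date_range("2024-01-01", periods=4, freq="D", tz="UTC")
local_days = days.tz_convert("Europe/Oslo")

# Moving statistics: two-day rolling mean over a time series.
ts = pd.Series([1.0, 2.0, 3.0, 4.0], index=days)
rolling_mean = ts.rolling(window=2).mean()

print(aligned_sum)
print(team_totals)
print(rolling_mean)
```

The first rolling value is NaN because a window of two observations is not yet full at that point.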

Resources and Tutorials:

Books:


Top comments (2)

Muna Mohammed

I really like it, it was written as one minute, but I think it is better than a lot of other articles. I like the resources part more than anything, thank you very much.

Rakesh KR

Thank you, glad you like the post 💗
