DEV Community

Aman Gupta

Handling Data in Python

In this post, we look at how Python handles data: its main data types and the data structures built on them.

Python has many built-in data types and many specialized ones in its standard library. We go through them one by one, starting with the built-in types:

1. dict

A mapping object maps hashable values to arbitrary objects. Mappings are mutable objects. There is currently only one standard mapping type, the dictionary.
A dictionary’s keys are almost arbitrary values. Values that are not hashable, that is, values containing lists, dictionaries, or other mutable types (that are compared by value rather than by object identity) may not be used as keys. Numeric types used for keys obey the normal rules for numeric comparison: if two numbers compare equal (such as 1 and 1.0) then they can be used interchangeably to index the same dictionary entry. (Note, however, that since computers store floating-point numbers as approximations it is usually unwise to use them as dictionary keys.)
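
The rules about keys are easier to see in action. Here is a minimal sketch; the variable names and values are made up for illustration and are not part of the original text.

```python
# Illustrative names and values only.
names = {1: "one"}

# 1 and 1.0 compare equal (and hash equal), so they index the same entry.
names[1.0] = "uno"          # overwrites the value stored under the key 1

# A tuple of hashable items is hashable, so it works as a key.
grid = {(0, 0): "origin"}

# A list is mutable and unhashable, so using it as a key raises TypeError.
try:
    grid[[0, 1]] = "bad key"
except TypeError as exc:
    error_message = str(exc)

# Float keys are fragile: 0.1 + 0.2 is not exactly 0.3, so the lookup misses.
weights = {0.3: "expected"}
found = (0.1 + 0.2) in weights
```

After this runs, names still has a single entry, and found is False even though the arithmetic looks like it should match.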

Dictionaries can be created by placing a comma-separated list of key: value pairs within braces.

For example:

{'jack': 4098, 'sjoerd': 4127}
{4098: 'jack', 4127: 'sjoerd'}

Dictionaries can be created by several means:

  1. Use a comma-separated list of key: value pairs within braces.
  2. Use a dict comprehension.
  3. Use the type constructor.

To illustrate, the following examples all return a dictionary equal to {"one": 1, "two": 2, "three": 3}:

a = dict(one=1, two=2, three=3)
b = {'one': 1, 'two': 2, 'three': 3}
c = dict(zip(['one', 'two', 'three'], [1, 2, 3]))
d = dict([('two', 2), ('one', 1), ('three', 3)])
e = dict({'three': 3, 'one': 1, 'two': 2})
f = dict({'one': 1, 'three': 3}, two=2)
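
The list of approaches mentions dict comprehensions, but none of the examples uses one. A possible sketch, with a hypothetical words list as the input, could look like this:

```python
# "words" is a hypothetical input list used only for this illustration.
words = ["one", "two", "three"]

# Dict comprehension: map each word to its 1-based position.
g = {word: index for index, word in enumerate(words, start=1)}

# Built from a literal and from the constructor, in a different key order.
b = {"one": 1, "two": 2, "three": 3}
h = dict({"three": 3, "one": 1, "two": 2})

# Equality ignores insertion order, but iteration preserves it.
all_equal = g == b == h
order_of_h = list(h)
```

Here all_equal is True even though h was built in a different order from the other two.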

2. list

Lists are mutable sequences, typically used to store collections of homogeneous items (where the precise degree of similarity will vary by application).

The constructor builds a list whose items are the same and in the same order as iterable’s items. iterable may be either a sequence, a container that supports iteration, or an iterator object. If iterable is already a list, a copy is made and returned, similar to iterable[:]. For example, list('abc') returns ['a', 'b', 'c'] and list( (1, 2, 3) ) returns [1, 2, 3]. If no argument is given, the constructor creates a new empty list, [].
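
To make the copying behaviour concrete, here is a small sketch (the names are illustrative). It also shows that the copy is shallow, meaning nested objects are shared rather than duplicated.

```python
original = [1, 2, 3]

# list() on an existing list returns a new list with the same items.
duplicate = list(original)
duplicate.append(4)          # does not affect "original"

# Any iterable works, including iterators and strings.
from_iterator = list(iter("abc"))
empty = list()

# The copy is shallow: the inner lists are the same objects in both lists.
nested = [[1, 2], [3]]
shallow = list(nested)
shallow[0].append(99)        # also visible through nested[0]
```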

Lists may be constructed in several ways:

  1. Using a pair of square brackets to denote the empty list.
  2. Using square brackets, separating items with commas.
  3. Using a list comprehension.
  4. Using the type constructor.

For example:

a = []
b = [1, 2, 3]
c = [x for x in range(3)]  # any iterable can be used in place of range(3)
d = list('abc')
e = list((1, 2, 3))

3. set and frozenset

A set object is an unordered collection of distinct hashable objects. Common uses include membership testing, removing duplicates from a sequence, and computing mathematical operations such as intersection, union, difference, and symmetric difference.
There are currently two built-in set types, set and frozenset.

  • The set type is mutable — the contents can be changed using methods like add() and remove(). Since it is mutable, it has no hash value and cannot be used as either a dictionary key or as an element of another set.
  • The frozenset type is immutable and hashable — its contents cannot be altered after it is created; it can therefore be used as a dictionary key or as an element of another set.
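
A short sketch of these ideas follows: the set operations, removing duplicates, and the difference in hashability between set and frozenset. All names and values are illustrative.

```python
evens = {0, 2, 4, 6}
small = {0, 1, 2, 3}

# Mathematical operations on sets.
union = evens | small
common = evens & small
only_evens = evens - small
either_not_both = evens ^ small

# Removing duplicates and testing membership.
unique = set([3, 1, 3, 2, 1])
has_two = 2 in unique

# A set is mutable.
evens.add(8)
evens.remove(0)

# A frozenset is hashable, so it can be a dict key or a set element.
pair = frozenset({1, 2})
labels = {pair: "a frozen pair"}
set_of_sets = {pair, frozenset({3})}

# A plain set is not hashable, so it cannot be an element of another set.
try:
    bad = {set([1, 2])}
except TypeError:
    set_is_hashable = False
else:
    set_is_hashable = True
```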

Sets can be created by several means:

  • Use a comma-separated list of elements within braces
  • Use a set comprehension
  • Use the type constructor

a = {'jack', 'sjoerd'}
b = {c for c in 'abracadabra' if c not in 'abc'}
c = set()
d = set('foobar')
e = set(['a', 'b', 'foo'])

4. tuple

Tuples are immutable sequences, typically used to store collections of heterogeneous data (such as the 2-tuples produced by the enumerate() built-in). Tuples are also used for cases where an immutable sequence of homogeneous data is needed (such as allowing storage in a set or dict instance).

Tuples may be constructed in a number of ways:

  • Using a pair of parentheses to denote the empty tuple
  • Using a trailing comma for a singleton tuple
  • Separating items with commas
  • Using the tuple() built-in:

The constructor builds a tuple whose items are the same and in the same order as iterable’s items. iterable may be either a sequence, a container that supports iteration, or an iterator object.
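
A few details here are easy to get wrong, so here is a hedged sketch with illustrative names: the comma (not the parentheses) makes a tuple, tuple() accepts any iterable, and tuples can be stored in a set because they are immutable.

```python
single = ("a",)              # the trailing comma makes this a tuple
not_a_tuple = ("a")          # without the comma this is just the string 'a'
implicit = 1, 2, 3           # parentheses are optional when separating with commas

# tuple() accepts any iterable, including strings and generators.
from_string = tuple("abc")
squares = tuple(n * n for n in range(4))

# enumerate() produces 2-tuples of (index, item).
pairs = list(enumerate(["x", "y"]))

# Tuples of hashable items are hashable, so they can live in a set.
visited = {(0, 0), (1, 2)}

# Tuples are immutable: item assignment raises TypeError.
try:
    single[0] = "b"
except TypeError:
    tuple_is_mutable = False
else:
    tuple_is_mutable = True
```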

a = ()
b = ('a', )
c = ('a', 'b', 'c')
d = tuple()  # returns an empty tuple

5. str

Textual data in Python is handled with str objects, or strings. Strings are immutable sequences of Unicode code points. String literals are written in a variety of ways:

  • Single quotes: 'allows embedded "double" quotes'

  • Double quotes: "allows embedded 'single' quotes".

  • Triple quoted: '''Three single quotes''', """Three double quotes"""

a = 'Aman'
b = "Aman"
c = '''I love python'''
d = """ I am enjoying it """

Triple quoted strings may span multiple lines - all associated whitespace will be included in the string literal.
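
As a quick illustration of that whitespace behaviour (the text itself is just a made-up sample), note how the indentation and the newlines end up inside the string:

```python
# The newline and the four spaces of indentation are part of the string.
poem = """Roses are red,
    Violets are blue."""
lines = poem.split("\n")

# A literal that opens and closes on its own lines starts and ends with "\n".
block = """
first
second
"""
```

If the extra indentation is not wanted, it has to be removed explicitly, since Python keeps it as written.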

6. bytes

Bytes objects are immutable sequences of single bytes. Since many major binary protocols are based on the ASCII text encoding, bytes objects offer several methods that are only valid when working with ASCII compatible data and are closely related to string objects in a variety of other ways.

Firstly, the syntax for bytes literals is largely the same as that for string literals, except that a b prefix is added:

  • Single quotes.

  • Double quotes.

  • Triple quoted.

a = b'still allows embedded "double" quotes'
b = b"still allows embedded 'single' quotes"
c = b'''3 single quotes'''
d = b"""3 double quotes"""

Only ASCII characters are permitted in bytes literals (regardless of the declared source code encoding). Any binary values over 127 must be entered into bytes literals using the appropriate escape sequence.

While bytes literals and representations are based on ASCII text, bytes objects actually behave like immutable sequences of integers, with each value in the sequence restricted such that 0 <= x < 256.
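
The "sequence of integers" view is easiest to see by indexing. A minimal sketch, using sample values chosen only for illustration:

```python
data = b"Hi!"

# Indexing gives an integer; slicing gives another bytes object.
first = data[0]
values = list(data)
head = data[:2]

# Values over 127 need an escape sequence in a bytes literal.
high = b"\xff\x80"
high_values = list(high)

# Non-ASCII text has to be encoded to bytes explicitly.
encoded = "\u00e9".encode("utf-8")   # the character e with an acute accent

# Each item must satisfy 0 <= x < 256, otherwise ValueError is raised.
try:
    bytes([256])
except ValueError:
    out_of_range_allowed = False
else:
    out_of_range_allowed = True
```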

7. bytearray

bytearray objects are a mutable counterpart to bytes objects.

As bytearray objects are mutable, they support the mutable sequence operations in addition to the common bytes and bytearray operations.

Since 2 hexadecimal digits correspond precisely to a single byte, hexadecimal numbers are a commonly used format for describing binary data. Accordingly, the bytearray type has an additional class method, fromhex(), to read data in that format.
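
Both points, mutability and reading hexadecimal data, can be sketched in a few lines. The byte values below are arbitrary examples.

```python
buf = bytearray(b"Hi!")

# Mutable sequence operations work in place.
buf[0] = ord("h")        # replace the first byte
buf.append(0x3F)         # append a single byte ('?')
buf.extend(b"ok")        # append several bytes

# fromhex() reads pairs of hexadecimal digits; spaces between pairs are allowed.
from_hex = bytearray.fromhex("48 69 21")

# hex() goes the other way and returns a string of hexadecimal digits.
hex_text = from_hex.hex()
```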

There is no dedicated literal syntax for bytearray objects. Instead, they are always created by calling the constructor:

  • Creating an empty instance
  • Creating a zero-filled instance with a given length
  • From an iterable of integers
  • Copying existing binary data via the buffer protocol

a = bytearray()
b = bytearray(10)
c = bytearray(range(20))
d = bytearray(b'Hi!')

Those are the built-in data types. Python's standard library also provides some specialized data types, which are just as useful.

8. datetime

The datetime module supplies classes for manipulating dates and times.

While date and time arithmetic is supported, the focus of the implementation is on efficient attribute extraction for output formatting and manipulation.

Available Types

  1. datetime.date : An idealized naive date, assuming the current Gregorian calendar always was, and always will be, in effect.
  2. datetime.time : An idealized time, independent of any particular day, assuming that every day has exactly 24*60*60 seconds.
  3. datetime.datetime : A combination of a date and a time.
  4. datetime.timedelta : A duration expressing the difference between two date, time, or datetime instances to microsecond resolution.
  5. datetime.tzinfo : An abstract base class for time zone information objects. These are used by the datetime and time classes to provide a customizable notion of time adjustment.
  6. datetime.timezone : A class that implements the tzinfo abstract base class as a fixed offset from UTC.
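
Here is a rough sketch of how these types fit together. The dates and the +05:30 offset are arbitrary values picked for illustration.

```python
from datetime import date, time, datetime, timedelta, timezone

d = date(2020, 10, 31)
t = time(12, 30)

# A datetime combines a date and a time.
dt = datetime.combine(d, t)

# A timedelta is a duration; adding it to a datetime gives another datetime.
later = dt + timedelta(days=1, hours=2)

# Subtracting two dates gives a timedelta.
gap = date(2021, 1, 1) - d

# timezone is a fixed offset from UTC (here an illustrative +05:30).
offset = timezone(timedelta(hours=5, minutes=30))
aware = dt.replace(tzinfo=offset)
as_utc = aware.astimezone(timezone.utc)
```

The dt value is naive (it has no tzinfo), while aware and as_utc carry an offset and represent the same instant.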

9. zoneinfo

The zoneinfo module provides a concrete time zone implementation to support the IANA time zone database as originally specified in PEP 615. By default, zoneinfo uses the system’s time zone data if available; if no system time zone data is available, the library will fall back to using the first-party tzdata package available on PyPI.

ZoneInfo is a concrete implementation of the datetime.tzinfo abstract base class, and is intended to be attached to tzinfo, either via the constructor, the datetime.replace method or datetime.astimezone.
For example:

>>> from zoneinfo import ZoneInfo
>>> from datetime import datetime, timedelta

>>> dt = datetime(2020, 10, 31, 12, tzinfo=ZoneInfo("America/Los_Angeles"))
>>> print(dt)
2020-10-31 12:00:00-07:00

>>> dt.tzname()
'PDT'

10. calendar

This module allows you to output calendars like the Unix cal program, and provides additional useful functions related to the calendar. By default, these calendars have Monday as the first day of the week, and Sunday as the last (the European convention). Use setfirstweekday() to set the first day of the week to Sunday (6) or to any other weekday.

The calendar module also provides some classes:

  1. calendar.Calendar : The base class. It prepares calendar data for formatting but does no formatting itself.
  2. calendar.TextCalendar : This class can be used to generate plain text calendars.
  3. calendar.HTMLCalendar : This class can be used to generate HTML calendars.
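
A small sketch of these classes and of the weekday numbering (Monday is 0, Sunday is 6) might look like the following. The year and month are arbitrary.

```python
import calendar

# Weekday of a given date: 0 is Monday, 6 is Sunday.
weekday = calendar.weekday(2020, 10, 31)

# monthrange() returns (weekday of the first day, number of days in the month).
first_weekday, days_in_month = calendar.monthrange(2020, 2)

# A Calendar that starts its weeks on Sunday yields weekdays in that order.
week_order = list(calendar.Calendar(firstweekday=calendar.SUNDAY).iterweekdays())

# TextCalendar and HTMLCalendar format a month as plain text or as an HTML table.
text = calendar.TextCalendar(firstweekday=calendar.SUNDAY).formatmonth(2020, 10)
html = calendar.HTMLCalendar().formatmonth(2020, 10)

print(text)
```

Passing firstweekday to the constructor affects only that calendar object, whereas setfirstweekday() changes the module-level default.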

Python also has more advanced data types in standard-library modules such as collections, heapq, bisect, and array.

Thanks for reading.
