DEV Community

Cover image for Python Developer Track for Oracle JSON and Duality Views - Part 5: Indexing
Javier
Javier

Posted on

Python Developer Track for Oracle JSON and Duality Views - Part 5: Indexing

In this tutorial we are going to learn about indexing. We can create different indexes with different purposes:

  1. Create an index to improve query performance
  2. Create a search index to search for keywords
  3. Create an index to get a Data Guide, or get the JSON structure

Let's go with the first index. As usual, first we are going to connect to the Oracle Database 23c and open our collection "hotel_reservations":

import oracledb
import json

#connect with thick-client
oracledb.init_oracle_client()
connection = oracledb.connect(user="myapp",
                              password="PassworD123##",
                              dsn="localhost:1521/FREEPDB1")
#auto commit
connection.autocommit = True

#Open collection for hotel reservations
soda = connection.getSodaDatabase()
collection = soda.openCollection("hotel_reservations")
Enter fullscreen mode Exit fullscreen mode

Now we have to define the index. In this case I want to create a composite index, as I would like to search by the reservation_id and the room_id. We are going to call the index "reservation_index". Then we will call the function "createIndex":

index_def = {
    'name': 'RESERVATION_INDEX',
    'fields': [
        {
            'path': 'reservation_id',
            'datatype': 'string',
            'order': 'asc'
        },
        {
            'path': 'room_id',
            'datatype': 'string',
            'order': 'asc'
        },

    ]
}
collection.createIndex(index_def)
Enter fullscreen mode Exit fullscreen mode

Easy, isn't it? Now let's create our second index. Let me show you the code first:

index_search_def ={ 
    "name"      : "SEARCH_AND_DATA_GUIDE_IDX",
    "dataguide" : "on",
    "search_on" : "text" }

collection.createIndex(index_search_def)
Enter fullscreen mode Exit fullscreen mode

In this case, the index is called "search_and_data_guide_idx". As the name describes, it has double purpose. There is a first parameter which is called "dataguide". Data guide provides a summary of the structural and type information about our JSON documents. Later on we will have a look into it. This parameter can be on or off.

The second parameter is "search_on". This parameter can have two values: text or text_value. If your search index involve only text search or string-equality search you can use text. If you are interested in numeric ranges as well as text, you have to use text_value.

Now that we have defined the index, let's do a query where we are looking for bookings that are talking about checking, because the customer had added a comment into the booking:

documents = collection.find().filter({"$textContains" : "check%"}).getDocuments()
print('\n Found the following documents talking about check:')
for d in documents:
    content = d.getContent()
    print(content)
Enter fullscreen mode Exit fullscreen mode

We can see we have a reservation with the following comment: Please arrange for a 2pm checkout.

As promised before, let's have a look at the Data Guide. One of the great benefits of JSON is the flexibility, and could be also one of the problems! If we want to know if there are any document which is not following the expected structure or if I want to know how it is, we can get this data guide. Let's have a look:

data_guide= collection.getDataGuide().getContent()
print(json.dumps(data_guide, indent=1))
Enter fullscreen mode Exit fullscreen mode

You should get the following JSON document. We can get all the info about number of fields, length and preferred column name.

>>> print(json.dumps(data_guide, indent=1))
{
 "type": "object",
 "o:length": 1,
 "properties": {
  "_id": {
   "type": "string",
   "o:length": 32,
   "o:preferred_column_name": "DATA$_id"
  },
  "room_id": {
   "type": "string",
   "o:length": 4,
   "o:preferred_column_name": "DATA$room_id"
  },
  "hotel_id": {
   "type": "string",
   "o:length": 4,
   "o:preferred_column_name": "DATA$hotel_id"
  },
  "guest_name": {
   "type": "object",
   "o:length": 1,
   "o:preferred_column_name": "DATA$guest_name",
   "properties": {
    "last_name": {
     "type": "string",
     "o:length": 16,
     "o:preferred_column_name": "DATA$last_name"
    },
    "first_name": {
     "type": "string",
     "o:length": 8,
     "o:preferred_column_name": "DATA$first_name"
    }
   }
  },
  "num_adults": {
   "type": "number",
   "o:length": 1,
   "o:preferred_column_name": "DATA$num_adults"
  },
  "total_cost": {
   "type": "number",
   "o:length": 4,
   "o:preferred_column_name": "DATA$total_cost"
  },
  "checkin_date": {
   "type": "string",
   "o:length": 16,
   "o:preferred_column_name": "DATA$checkin_date"
  },
  "num_children": {
   "type": "number",
   "o:length": 1,
   "o:preferred_column_name": "DATA$num_children"
  },
  "checkout_date": {
   "type": "string",
   "o:length": 16,
   "o:preferred_column_name": "DATA$checkout_date"
  },
  "payment_status": {
   "type": "string",
   "o:length": 4,
   "o:preferred_column_name": "DATA$payment_status"
  },
  "reservation_id": {
   "type": "string",
   "o:length": 1,
   "o:preferred_column_name": "DATA$reservation_id"
  },
  "guest_contact_info": {
   "type": "object",
   "o:length": 1,
   "o:preferred_column_name": "DATA$guest_contact_info",
   "properties": {
    "email": {
     "type": "string",
     "o:length": 32,
     "o:preferred_column_name": "DATA$email"
    },
    "phone": {
     "type": "string",
     "o:length": 8,
     "o:preferred_column_name": "DATA$phone"
    },
    "address": {
     "type": "object",
     "o:length": 1,
     "o:preferred_column_name": "DATA$address",
     "properties": {
      "city": {
       "type": "string",
       "o:length": 16,
       "o:preferred_column_name": "DATA$city"
      },
      "country": {
       "type": "string",
       "o:length": 8,
       "o:preferred_column_name": "DATA$country"
      }
     }
    }
   }
  },
  "additional_requests": {
   "type": "array",
   "o:length": 1,
   "o:preferred_column_name": "DATA$additional_requests",
   "items": {
    "properties": {
     "type": {
      "type": "string",
      "o:length": 16,
      "o:preferred_column_name": "DATA$type"
     },
     "details": {
      "type": "string",
      "o:length": 64,
      "o:preferred_column_name": "DATA$details"
     },
     "quantity": {
      "type": "number",
      "o:length": 1,
      "o:preferred_column_name": "DATA$quantity"
     }
    }
   }
  }
 }
}
>>>
Enter fullscreen mode Exit fullscreen mode

We can use this Data Guide for creating a table to make it very easy to run SQL over it, but this will be in a different chapter ;)

Top comments (0)