victor_dalet

Website Time dataset

Hello, I found a dataset on Kaggle about the time users spend on a website, so I want to look at the relationship between the number of pages visited and the total time spent on the website.

You can find the dataset and the code on my GitHub: https://github.com/victordalet/Kaggle_analysis/tree/feat/website_traffic


I - Installation

To do this, I use pandas and sqlalchemy in Python to load my CSV into a SQLite database, and plotly to display my results.

pip install pandas
pip install plotly
pip install sqlalchemy
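As a minimal sketch of the CSV-to-database step, assuming an in-memory SQLite database and a small made-up DataFrame in place of the Kaggle file (the table name demo_table and its values are illustrative only, not taken from the dataset):

```python
import pandas as pd
from sqlalchemy import create_engine, text

# In-memory SQLite database, so nothing is written to disk (assumption for this demo).
engine = create_engine("sqlite:///:memory:")

# Made-up rows standing in for the real CSV.
df = pd.DataFrame({"Page_Views": [3, 5], "Time_on_Page": [1.5, 4.0]})

with engine.begin() as connection:
    # if_exists="replace" recreates the table instead of adding rows to it.
    df.to_sql("demo_table", connection, index=False, if_exists="replace")
    # Count the rows to confirm the table was written.
    row_count = connection.execute(text("SELECT COUNT(*) FROM demo_table")).scalar()

print(row_count)
```

With a file-based database, the same if_exists choice decides whether running the script again replaces the table or keeps adding copies of the rows.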

II - Code

I create a Main class that loads my CSV into a database, then reads the two columns I need back out with the get_data method.
The result is a list of tuples, so the transform_data method converts it into a list of lists.
Finally, display_graph draws a simple scatter plot of the number of pages viewed against the total time.

import pandas as pd
from sqlalchemy import create_engine, text
import plotly.express as px


class Main:
    def __init__(self):
        self.result = None
        self.connection = None

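        # Load the CSV into a local SQLite database.
        self.engine = create_engine("sqlite:///my_database.db", echo=False)
        self.df = pd.read_csv("website_wata.csv")
        # "replace" recreates the table, so re-running the script does not duplicate rows.
        self.df.to_sql("website_data", self.engine, index=False, if_exists="replace")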
        self.get_data()
        self.transform_data()
        self.display_graph()

    def get_data(self):
        # Fetch the two columns of interest; the connection closes when the block exits.
        with self.engine.connect() as connection:
            query = text("SELECT Page_Views, Time_on_Page FROM website_data")
            self.result = connection.execute(query).fetchall()

    def transform_data(self):
        # Convert each row (a tuple-like object) into a plain list.
        self.result = [list(row) for row in self.result]

    def display_graph(self):
        # Column 0 holds the page views, column 1 the time.
        fig = px.scatter(
            self.result, x=0, y=1, title="Page views vs. time on page"
        )
        fig.show()


if __name__ == "__main__":
    Main()
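The conversion done in transform_data can be tried in isolation. The sketch below assumes an in-memory SQLite database and a hypothetical demo table with two made-up rows; only the column names come from the query above.

```python
from sqlalchemy import create_engine, text

# In-memory database with a hypothetical "demo" table (not the Kaggle data).
engine = create_engine("sqlite:///:memory:")

with engine.begin() as connection:
    connection.execute(text("CREATE TABLE demo (Page_Views INTEGER, Time_on_Page REAL)"))
    connection.execute(text("INSERT INTO demo VALUES (3, 1.5), (5, 4.0)"))
    # fetchall() returns a list of tuple-like Row objects.
    rows = connection.execute(
        text("SELECT Page_Views, Time_on_Page FROM demo ORDER BY Page_Views")
    ).fetchall()

# Turn each row into a plain list, giving a list of [pages, time] pairs.
points = [list(row) for row in rows]
print(points)
```

Each inner list is one session, with the page count first and the time second, which is the shape the scatter plot reads through x=0 and y=1.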

III - Result

The x-axis indicates the number of pages visited by the user, while the y-axis shows the time spent on the website in minutes.

We can see that the users who stay the longest visit between 4 and 6 pages, and that users who visit between 11 and 15 pages all stay at least a few minutes.

[Figure: scatter plot of the number of pages visited (x-axis) against the time spent in minutes (y-axis)]
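One way to put numbers on this reading of the plot is to group the sessions by page count and look at the shortest and longest time in each group, along with the average time per page. This is only a sketch and not part of the original code: summarize_by_pages is a hypothetical helper, and the points list is made up for illustration rather than taken from the dataset.

```python
from collections import defaultdict


def summarize_by_pages(points):
    """Map each page count to (min_time, max_time, mean_time_per_page)."""
    times = defaultdict(list)
    for pages, minutes in points:
        times[pages].append(minutes)

    summary = {}
    for pages, values in sorted(times.items()):
        mean_time = sum(values) / len(values)
        # Avoid dividing by zero for sessions with no page views.
        per_page = mean_time / pages if pages else None
        summary[pages] = (min(values), max(values), per_page)
    return summary


# Made-up [pages, minutes] pairs, in the same shape as the transformed result.
points = [[4, 2.0], [4, 6.0], [12, 3.0]]
summary = summarize_by_pages(points)
for pages, (shortest, longest, per_page) in summary.items():
    print(pages, shortest, longest, per_page)
```

Run on the real list of lists, the minimum column would show whether the "at least a few minutes" observation holds for every page count between 11 and 15.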
