DEV Community

Itheum


Itheum WhitePaper

Abstract

Itheum is the world's 1st decentralized data brokerage platform that transforms your personal data into a highly tradable asset class. It provides Data Creators and Data Consumers with the tools required to "bridge" highly valuable personal data from web2 into web3 and to then trade data with a seamless UX that’s built on top of blockchain technology and decentralized governance. We provide the end-to-end platform required for personal data to be made available in web3 for the first time in history and to enable many more wonderful and complex real-world use cases to enter the web3 ecosystem. Itheum provides the core, cross-chain web3 protocol required to enable personal data ownership, data sovereignty, and fair compensation for data usage - and this positions Itheum as the data platform for the Web3 and Metaverse Era.




Disclaimers

This Whitepaper and any other documents published in association with it including the related token sale terms and conditions (the Documents) relate to a potential token (Token) offering to persons (contributors) in respect of the intended development and use of the network by various participants. The Documents do not constitute an offer of securities or a promotion, invitation or solicitation for investment purposes. The Documents are not intended to be a financial services offering document or a prospectus. The token offering involves and relates to the development and use of experimental software and technologies that may not come to fruition or achieve the objectives specified in this Whitepaper. The purchase of Tokens represents a high risk to any contributor. Tokens do not represent equity, shares, units, royalties or rights to capital, profit or income in the network or software or in the Token issuing entity or any other entity or intellectual property associated with the network or any other public or private enterprise, corporation, foundation or other entity in any jurisdiction. The Token is not therefore intended to represent a security or similar legal interest.
The purchase of Tokens involves significant risks and prior to purchasing them, you should carefully assess and take into account the potential risks including those described in the Documents and on our website.
Although there may be speculation on the value of the Tokens, we disclaim any liability for the use of Tokens in this manner. A market in the Tokens may not emerge and there is no guarantee of liquidity in the trading of the Tokens nor that any markets will accept them for trading.
This Whitepaper describes a future project and contains forward-looking statements that are based on our beliefs and assumptions at the present time. The project envisaged in this Whitepaper is under development and being constantly updated and accordingly, if and when the project is completed, it may differ significantly from the project set out in this whitepaper. No representation or warranty is given as to the achievement or reasonableness of any plans, future projections or prospects and nothing in the Documents is or should be relied upon as a promise or representation as to the future.
The laws of various jurisdictions may apply to the Tokens. The regulatory treatment of the Tokens may change from time to time which may affect the offering and use of the Token and the development of the project described in this Whitepaper.


Introduction

Every day, billions of people give away their personal data to organizations in return for some service or product. They sign up to apps, websites, social networks, make online purchases, use digital banking for their day to day transactions, use wearables to monitor their health, and use countless other digital services run by commercial profit-seeking organizations who absorb their personal data and put it into locked-up data silos.

These organizations then use your personal data to learn more about you and people like you. They then create more products and services for you to buy and get hooked onto… they call this "stickiness"; and their objective is for you to keep coming back and to share more data.

Some organizations (usually the largest and most influential) can even use your data to influence your thinking and ideas, or resell your data to 3rd Party Data Brokers (independent commercial organizations). These 3rd Party Data Brokers lurk in the shadows and "deal in the personal data trade", making millions of dollars of profit by packaging and selling your data to other organizations. There are over 4000 of these Data Brokers in the world today and they have over 3000 data points collected on each one of us! The entire Data Brokerage industry is worth over USD 200 billion each year, yet the person generating this data (you!) has no idea how it is used and receives no form of compensation.

At Itheum, we believe that YOUR DATA IS YOUR BUSINESS!


What is Itheum?

Itheum wants to change this current toxic model for personal data collection and exchange and level the playing field — where the commercial enterprise and “you” (the Data Creator) equally benefit from the trade of personal data.

Itheum empowers data ownership in the metaverse and brings new market value to your data. It enables this by providing “decentralized data brokerage” technology. It’s a suite of tools that enables high-value data to be bridged from web2 to web3 and then be traded via peer-to-peer sales. It allows for “viral adoption” via our creative NFMe ID (Non-Fungible Me ID) and Data NFT technology and our innovative Data Coalition DAOs (which bulk trade your data). It also aims to be fully privacy-preserving, regulation-friendly, and cross-chain; making it the most comprehensive core blockchain data infrastructure available in the market with use-cases in both the enterprise and consumer space.

It provides 3 fundamental products (that when used together) will flip the dynamic of personal data collection and exchange.


Our Core Products

Itheum has 3 Core Products that work in unison to enable new data to be generated and collected from the web2 world and for that data to then flow seamlessly into the web3 domain (via our Data CAT - Collection & Analytics Toolkit). This data can be claimed by the Data Creators (users who originally generated the data) and traded using our innovative peer-to-peer data trading technology (via our Data DEX).

For a platform like Itheum to reach its highest potential it also needs mass adoption (more users = more data = more data trade). We aim to achieve this via our final core product, the Data Metaverse, where we provide "viral", "sexy" and "low barrier to entry" consumer products that will appeal to the masses and in turn fuel mass adoption.

1) Data Collection & Analytics Toolkit (Data CAT)

Tools for structured and rich personal data collection and analytics.

Anyone can use Itheum's Data CAT to build apps and programs that collect structured and rich personal data from users and also provide visual trends and patterns on the collected data (usually using fully anonymous or semi-anonymous analytics to protect the “Data Creator”, i.e. the user who generated the original data).

With this product, Itheum provides real-world value and adoption, and in return we generate highly structured and outcome-oriented personal datasets (i.e. we normalize what user data looks like across organization silos).

This product is essentially a fully-featured personal data and analytics platform offered as a PaaS (platform as a service) to organizations. For example:

  1. A health and wellness company can use the Itheum Data CAT and embed it into their apps; enabling the collection of health data like blood pressure, fitness/activity, sleep quality, and then visualize trends and patterns of their app userbase.

  2. A financial organization can use the Itheum Data CAT to collect scheduled data via customized surveys or questionnaires, triggered by certain actions related to spending patterns — the trends and patterns can then give them more context of their customers’ spending habits.

Itheum already has multiple apps and programs using the Data CAT — you can read about these real-world case studies here

With this product offering; we are enabling the creation of highly-structured, outcome-oriented and normalized personal datasets (e.g. here is a sample of the data insights being collected by the OKPulse app which is built on Itheum). This data can then be bridged from web2 to web3 and unlocked by our core web3 data protocol - called the Data DEX.

2) Decentralized Data Exchange (Data DEX)

Suite of web3 tools for the seamless trade of personal data.

This product enables you (the Data Creator) to own and trade the personal data which was collected by the organizations who built apps on Itheum's Data CAT product. It unlocks the personal data from these organization silos and lets you trade your data on the open market with organizations and agencies that can derive value from these previously locked datasets.

So in those same examples given above:

  1. If you use that health and wellness company’s app; you get to own your data and then trade the health and wellness data you collect as part of your subscription to that app by using the Itheum Data DEX.

  2. If you use that financial organization product; you get to own your behavioral data and trade it as you choose with anyone else on the Itheum Data DEX.

As your personal data was collected as part of a real-world product or service, it’s highly-structured, outcome-oriented, and normalized — which means it’s highly valuable to many organizations across the world looking to build high-quality datasets to power their machine learning & analytics backed business insights engines.

This is the real value of Itheum and a key differentiator when compared to other blockchain-based data platforms. They struggle to gain any real momentum as their products cannot effectively be used in the real world. Itheum wants to focus on real-world adoption and our 3 core products are a balanced approach to adoption and decentralized self-sovereignty.

Using the Itheum Data DEX, you can trade your data via a peer-to-peer, direct sale method, or use our Data Coalition technology and align to a decentralized entity (backed by a DAO) which will trade your data on your behalf (whilst acting in your best interests) and compensate you after the successful trade of the data. Data Coalitions are explained in more detail in a later section, but they are a bit like decentralized credit unions (see this view on Credit Union Philosophy): by representing many members (DAO stakers), they have the collective bargaining power to decide who can procure your data and what it can be used for (i.e. they trade on behalf of their members' interests).

The Data DEX is extremely powerful: it also allows you to secure sensitive personal data in secure data vaults and to wrap your data as an NFT so you can earn royalties on any re-sale of your data (all these features are described in detail in later sections). It also aims to be fully multi-chain compliant (EVM and non-EVM blockchains).

The Data DEX can also be thought of as the core data protocol layer that enables decentralized data ownership and trade. This protocol can be abstracted with more products built on top of it; the Data Metaverse being an example of such a consumer-friendly, "viral" product suite.

3) Data Metaverse

Suite of metaverse, gaming and NFT aligned consumer products that appeal to the masses with a goal to enable mass adoption.

The concepts behind data trading and blockchain-backed data technology can be overwhelming for most people to grasp. This is a key reason why many of the current web3 data platforms have very low adoption and hype. For platforms such as Itheum to change the dynamic of data ownership we need large-scale bottom-up adoption - and this kind of adoption can only come if we build consumer products that "everyone wants to use".

In web3, we have observed that this viral adoption is most present in the gaming industry, where there are large groups of people from varying socioeconomic and demographic backgrounds who are readily willing to participate in new gaming platforms and adopt new technology. The growth in play-to-earn games like "Axie Infinity" and metaverse games like "The Sandbox" is evidence of this (with new platforms coming to market weekly and still having rapid adoption and growth).

The challenge for Itheum was to take "serious and complex ideas around data ownership" and abstract them with "sexy and easy-to-use consumer products". We knew that if we achieved this we could drive large adoption for our platform and in turn normalize the complex concepts around data ownership and trade. In return, Itheum gets a large user base to test our core protocol layer and to increase the volume of data available for decentralized data trade.

The key products that form our Data Metaverse ecosystem are:

  1. NFMe ID (Non-Fungible Me ID)
    Every person on earth can be thought of as a "non-fungible human", as we have data that is unique to each one of us (like our DNA or even our fingerprints, which are non-fungible). NFMe ID is a digital avatar identity that is built on dynamic NFT technology and backed by real personal data. Everyone in the world can mint an NFMe ID on Itheum's platform and "truly own" it. As more data is collected by the "NFMe ID host" and pushed into the NFMe ID, more "accessories" are minted and linked to the NFMe ID. These accessories are NFTs themselves and can be freely traded. NFMe ID avatars are the next generation of "Profile Picture NFT" and/or gaming avatars, and the goal is to have them evolve into a universal avatar for each human on earth. They will be portable in the real world and interoperable across web2 and web3 metaverse platforms (e.g. Meta, The Sandbox, Axie Infinity, Decentraland), backed by real value (in the form of personal data), and powered by true ownership (NFT technology).

  2. Data NFTs
    The NFMe ID host can choose to take subsets of their base data and mint them into separate Data NFTs. These Data NFTs are linked to the base NFMe ID but can be independently traded in our own Data NFT marketplace or secondary marketplaces (e.g. OpenSea). As an example, the user can add more data to their NFMe ID by joining an app like the Red Heart Challenge (an app built on Itheum's Data CAT). As the user participates in the data collection within this app, data will flow into the NFMe ID. The user can then choose to mint a Data NFT that only holds the subset of data generated via the Red Heart Challenge. This Data NFT can then be traded to a commercial enterprise or researcher who may be keen to access the core dataset. Ownership of a Data NFT provides access to the underlying data (so long as the "terms of use" are followed).

  3. Data Coalition DAOs
    A more detailed explanation of Data Coalition DAOs is provided in a later section, but within the context of the Data Metaverse you can think of Data Coalition DAOs as enterprises or groups who are keen to trade bulk datasets on behalf of a large group of individual users. As personal data is much more valuable when traded in bulk, it makes sense to align to a Data Coalition DAO and have your data grouped into a large dataset. In the Data Metaverse, a Data Coalition may occupy some "land" and set up its "virtual DAO premise". Your NFMe ID will be able to visit this premise and choose to align to the Data Coalition DAO, which in turn will make sure the user's best interests are taken care of during any trade of data that happens as part of this interaction.

  4. Greenroom Protocol
    We envision a future where metaverse interoperability can be achieved via open-standard digital avatars and your NFMe ID avatars aim to be these open-standard avatars. Interoperability between metaverse platforms is a very early-stage idea but the primary use case is to have the "users or players" move their digital assets and avatars between independent metaverse platforms. As an example; a player on The Sandbox can move their avatars and assets to Decentraland or possibly even into centralized metaverse platforms like Meta and participate in the native gameplay and ecosystem exploration. The Greenroom protocol is Itheum's initiative for open-standard metaverse interoperability. It will consist of two parts:

  • A set of rules and standards to define NFT metadata that can then be implemented across metaverse platforms. The metadata can indicate the data for the visual characteristics of a character and (possibly) the standards for animation rigging to ensure the character appears as a native character in any metaverse platform.
  • A metaverse interoperability "virtual waiting room" called the Greenroom. The Greenroom is where your NFMe ID will exist (visually in 3D) and you will be able to see and organize your "accessories" and other NFMe ID characters and trade accessories whilst interacting with them. The Greenroom is where the "Itheum Data metaverse" will integrate with other metaverse platforms like The Sandbox. Once open metaverse integrations are built with other 3rd party platforms, they will appear as a portal inside your Greenroom and you will be able to enter it and teleport your NFMe ID avatar into other metaverse ecosystems. In show business; a Greenroom is a waiting room where performers "hang out" until they are called on stage - similarly; Itheum's metaverse Greenroom is where your NFMe ID avatar can hang out until it can teleport to other metaverse platforms.

Solution Overview Diagram

The below high-level solution overview diagram details how all our key components and features align to enable Itheum's comprehensive data technology solution for the web3 and metaverse ecosystem.

[Diagram: Itheum high-level solution overview]


Important disclaimer on usage of certain words in following sections

It must be duly noted that we use terms such as "Buy", "Buying", "Purchase", "Pay", "Sale", "Selling" and "Sell" purely for simplification of explanation in the following sections; in reality, a "sale" or "purchase" refers to an "access rights request transaction" between two parties and not to any form of monetary transaction. Itheum is a system that allows for data access to be granted via peer-to-peer handshakes coordinated by our $ITHEUM utility token. Please read our Token Utility section for more on this. It must also be duly noted that we use the $ symbol as a prefix to the $ITHEUM utility token as a means to simplify our written communication in this whitepaper and for the reader to be able to differentiate the $ITHEUM token from the "Itheum platform" - the $ is in no way, form or shape an indication that it's a form of currency, it is purely a "software-based utility key" that is needed to unlock access to data stored within the Itheum platform.


Decentralized Data Trade

The Data DEX allows for the seamless trade of personal data by “Data Creators”. The sale, verification, and ownership accreditation of the data are handled on-chain but the actual data being traded is kept off-chain.

There are a few reasons for this:

  1. On-chain storage of large datasets is not feasible. Blockchains are not built for the storage of data and doing this will be costly and also pollute the blockchain ledger.

  2. On-chain storage of entire data or segments of data can also lead to privacy and data sovereignty issues as the blockchain is a fully open, distributed, and transparent tool.

For these reasons, the trade of data is facilitated via a hybrid on-chain/off-chain model.

  1. Data is first validated and hashed. The validated and encrypted dataset is then uploaded to a hidden (non-public) destination in a centralized, secure storage location.

  2. It’s important to note that the above “storage location” is centralized rather than decentralized (via IPFS or Filecoin, for example). The main reason is data sovereignty law: certain countries or regions require data to remain within geographical boundaries. We are working on a solution that allows for decentralized, region-based storage or validated decentralized node-based storage. This will be abstracted via both the Data Coalitions and Regional Decentralized Hubs features, where privacy, security, and regulatory requirements for storage of data are governed by a delegated, authorized Data Coalition Board who will manage these on behalf of the end-user.

  3. The hashed value and the identifier for the location are stored on the blockchain. The availability of this new data for trade is then “Advertised” — allowing for this new dataset to be “discovered” and then purchased via on-chain transaction facilitation.
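The three steps above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not Itheum's implementation: the choice of SHA-256, the `OnChainAdvert` record, the `advertise` and `verify_download` helpers, the in-memory `ledger` list standing in for the blockchain, and the `store-eu-001` location identifier are all hypothetical.

```python
import hashlib
from dataclasses import dataclass


@dataclass(frozen=True)
class OnChainAdvert:
    """What would be recorded on-chain: a hash and a pointer, never the data."""
    data_hash: str    # SHA-256 hex digest of the validated dataset (assumed algorithm)
    location_id: str  # opaque identifier of the off-chain storage location
    seller: str       # seller's wallet address


def hash_dataset(raw: bytes) -> str:
    """Return the SHA-256 hex digest of the dataset bytes."""
    return hashlib.sha256(raw).hexdigest()


def advertise(raw: bytes, location_id: str, seller: str, ledger: list) -> OnChainAdvert:
    """Validate (here: non-empty), hash, and append an advert to the simulated ledger."""
    if not raw:
        raise ValueError("dataset is empty; nothing to advertise")
    advert = OnChainAdvert(hash_dataset(raw), location_id, seller)
    ledger.append(advert)
    return advert


def verify_download(downloaded: bytes, advert: OnChainAdvert) -> bool:
    """Check that downloaded bytes match the hash that was advertised."""
    return hash_dataset(downloaded) == advert.data_hash


# Hypothetical usage: the ledger list stands in for the blockchain.
ledger: list = []
dataset = b'{"bp_systolic": 120, "bp_diastolic": 80}'
advert = advertise(dataset, "store-eu-001", "0xSELLER", ledger)
```

Because only the hash and the location identifier are published, a buyer who later downloads the dataset can call `verify_download` to confirm it is the same data that was advertised, without the data ever touching the ledger.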

At this point, there are “two groups” of data you can trade on the DEX. They are as follows:

1) Trading Core Itheum Data

The core Itheum platform is a general-purpose personal data collection platform where organizations can build applications (which require structured user data) using Itheum's Data CAT.

Examples of these structured data collection applications are:

The applications are called “programs” within the Itheum platform and these programs can onboard their own “end-users". These end-users generate a lot of personal data and have access to an end-user portal app which provides full visibility over the data and insights being collected.

This demo video shows you an example of the end-user portal for OKPulse (https://www.youtube.com/watch?v=ITONnseBFV4)

The end-user portal allows the user (Data Creator) to link their Elrond, Ethereum, Polygon, BSC, Avalanche (or other supported Blockchain) account to their Itheum platform account. Once they have done this they can then use the Data DEX to load the raw datasets they have generated and advertise them for trade.

As the data collected via an Itheum application is fully structured and outcome-oriented, for example — Red Heart Challenge data is centered around the self-management of cardiovascular disease and OKPulse is around the proactive monitoring of employee health and wellness issues — it is very valuable for a Data Consumer who wants to align analytics discovery or outcome analysis around a certain topic.

When this data is grouped with multiple people who have joined the same application via a “Data Coalition DAO” and augmented with personal data via the “Data Vault” then the value of the data can grow exponentially as the relevance, quantity, and quality of the datasets grow.

2) Trading Any Arbitrary Data

The Data DEX also allows you to trade any arbitrary data using the same on-chain facilitation process. This allows anyone with a crypto wallet to upload and trade datasets via the Data DEX. This lowers the barrier to entry for end-users and also provides them an equal opportunity to participate in the shared data economy. At this stage, we allow for the trade of the following arbitrary datasets.

2.1) Facebook Profile Data

As you may know, Facebook opened up the option for individuals to download all their data in bulk. This was a feature that was released by Facebook after pressure from data advocacy regulators who wanted to ensure individuals had the right to download their data from the Facebook platform should they ever want to delete their Facebook account or move to a new social media platform.

The Itheum Data DEX will allow for the above scenario (of moving to a new social media platform) to be handled seamlessly. For example, an individual can download all their Facebook data and advertise it for trade on the Data DEX, they can then align with a Data Coalition centered around the responsible usage of data by Social Media organizations. A new social media platform can then view the “Facebook user datasets” managed for trade by the Data Coalition, agree to the responsible terms of use, and procure bulk access to the data. The new social media platform uses the Data DEX to bootstrap and migrate Facebook users to its platform directly via an authorized, delegated owner (Data Coalition) of the end-user data. The end-user (original Facebook user) then gets the relative micropayments from the Data Coalition and can also view the complete audit trail of data transfer between the social media platforms.

2.2) Any Other General Data

The Data DEX also allows for the on-chain trade of any other type of data. For example, you can create a dataset of all the brands and products you, your family and friends like, all the reviews and ratings of your favorite restaurants, personal fitness, or other wellness data. You can also create and trade other interesting IP-centric general datasets like utterance-to-intent mappings, which can then be used by organizations to train NLU and speech-to-text applications. For example, a hospitality booking organization (which provides telephony services or conversational tools like chatbots to help users make a booking) might already be investing heavily in studying end-user speech-to-text patterns and mapping them to intents that are relevant to their industry. They can sell the utterance-to-intent mapping data on the Data DEX to recoup some of their spending in producing this IP.


Types of Direct Trade

As described in the above section - you can use the Itheum Data DEX to trade personal data using on-chain tools that coordinate the trade between buyer and seller. On the Data DEX, there are essentially two types of direct trade that can occur. We mention “direct trade” here as it’s the seller (Data Creator) who decides which type they prefer and initiates the trade process within those constraints. In a later section we will describe how data can also be sold indirectly via a delegated Data Coalition, but for now, let’s focus on the following two direct trade types.

  1. Trade of Data Packs
  2. Trade of Data NFTs

Let’s now dive into these two types individually and understand the difference.

Trade of Data Packs

Once a Data Creator decides to trade their personal data, the default method of trade is via Data Packs. A Data Pack holds a reference to the type of dataset it advertises and contains some metadata about it. For example, it provides a preview of the data, when it was created, the terms of use, the link to where the data can be securely downloaded from (after the trade is complete), etc. Once the user creates a Data Pack it gets advertised for sale ‘on-chain’. Once this on-chain advertising process is completed, the original “data hash” and the “transaction reference” of the advertising process are stored as metadata against the Data Pack. A $ITHEUM price is automatically calculated based on the type of data and market demand for that type of data. In the future, we will also allow the Data Creator to set a starting price for their data. The sale of Data Packs can be described as direct peer-to-peer trade.

Once the on-chain advertising process is complete, buyers will see the advertised Data Packs in the “Buy Data” section of the Data DEX. The buyer can pay the $ITHEUM cost for the data and then “own” access to the underlying data (these copies are called Data Orders and appear in the Purchased Data section of the Data DEX). As part of the buying process, they agree (on-chain) to abide by the “terms of use”. A record of this agreement is stored on-chain as part of the transaction and serves as an immutable audit trail for this agreement. One key point to note is that in this type of peer-to-peer trade — the re-sale of data is NOT permitted.
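As a rough sketch of the Data Pack flow described above, the Python below models a pack's metadata and a purchase that produces a Data Order. It is illustrative only: the field names, the `buy_pack` function, the `balances` dictionary standing in for on-chain $ITHEUM balances, and all sample values are assumptions rather than Itheum's actual data model.

```python
from dataclasses import dataclass


@dataclass
class DataPack:
    pack_id: str
    seller: str
    data_preview: str
    terms_of_use: str
    data_hash: str      # hash stored when the pack was advertised
    advert_tx_ref: str  # transaction reference of the advertising step
    price_itheum: int   # price in $ITHEUM, calculated by the platform


@dataclass
class DataOrder:
    pack_id: str
    buyer: str
    agreed_terms: str             # the terms the buyer accepted
    resale_allowed: bool = False  # Data Packs do not permit re-sale


def buy_pack(pack: DataPack, buyer: str, balances: dict, accepts_terms: bool) -> DataOrder:
    """Transfer the price from buyer to seller and record the agreed terms."""
    if not accepts_terms:
        raise PermissionError("buyer must agree to the terms of use")
    if balances.get(buyer, 0) < pack.price_itheum:
        raise ValueError("insufficient $ITHEUM balance")
    balances[buyer] -= pack.price_itheum
    balances[pack.seller] = balances.get(pack.seller, 0) + pack.price_itheum
    return DataOrder(pack.pack_id, buyer, pack.terms_of_use)


# Hypothetical usage with made-up identifiers and amounts.
pack = DataPack("pack-1", "creator", "first 3 rows", "research use only",
                "ab12", "tx-001", 50)
balances = {"buyer": 120}
order = buy_pack(pack, "buyer", balances, accepts_terms=True)
```

The same pack can be bought repeatedly, each purchase yielding a new Data Order, which mirrors the "multiple copies, no re-sale" nature of Data Packs.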

A Data Creator should choose to trade data as a Data Pack if they have the following requirements:

  • Sell multiple (potentially unlimited) copies of their data to buyers (Data Consumers) around the world
  • Don’t want their data to be re-sold by the buyer (i.e. the buyer can only use it as per the terms of sale and for their own use/consumption). If a buyer breaks their agreement and resells the data, the owner will be able to detect this and mediate a conflict resolution process, but only if the trade is handled by the Data Coalition (more on this below). The Data Creator, unfortunately, won’t have a direct method to track the re-sale or get any benefit from it (e.g. royalties).
  • Only allow for their personal data to be traded on the Itheum Data DEX (no other decentralized marketplace will display the Data Pack for sale — this is because it’s not built on any open blockchain standard like ERC20 or ERC721 that allows for interoperability). This can be considered a benefit if the Data Creator wants to limit exposure of their data (as it’s available for trade, in only one data marketplace)

Trade of Data NFTs

A Data Creator can also choose to trade their personal data as an NFT (Non-Fungible Token). This makes a lot of sense as personal data is unique and NOT fungible (watch this short video to understand the difference between fungible and non-fungible). It gives the Data Creator more control of their data, lets them align to NFT features that make their data grow in value (rarity/scarcity), and increases the exposure of their data assets thanks to the interoperability and portability that NFT standards have in the blockchain ecosystem.

Data NFTs are described in more detail below and use the ERC721 open standard to coordinate and facilitate the NFT transaction between parties involved in the trade.

A Data Creator should choose to trade data as a Data NFT if they have the following requirements:

  • Trade a “limited number” of copies of data: This is the same concept as the limited edition NFTs we see in the market today. Having a limited number of copies of the data will create organic growth demand for the data as the buyer will realize the rarity and scarcity of the data. For example, if a Data Creator “only mints 2 copies” of their DNA sequencing result — this has a lot of value to a buyer trying to build a dataset of similar DNA sequencing results as there are only 2 copies available. If there were virtually unlimited copies available (as you would have with Data Pack-based trade) — then the perceived inherent value for the data will be less.
  • Allow for the “re-sale” of data: The Data Creator is happy to allow for the re-sale of the NFT packaged data by a buyer. This opens up a lot of opportunities for the data to grow in value as the buyer might have a better ability to “market the data”. This also opens up opportunities for secondary markets where “verified, legitimate data brokers can exist on-chain” — a revolutionary concept and a futuristic solution to the problem that exists today where centralized data brokers are selling your personal data without your knowledge.
  • Benefit from the “re-sale” of data via Royalties: Your data is essentially your IP (much like a song by a music artist or a book by an author) and packaging your data and selling it as an NFT will allow you to earn a royalty on the resale of the data. For example, you can choose to nominate a 10% royalty condition and if a buyer re-sells the data, 10% of the sale will be transferred to your account.
  • Benefit from multiple NFT marketplaces: As Itheum Data NFTs are built on the ERC721 standard, they immediately have interoperability with all the NFT marketplaces that support this standard (OpenSea, Rarible, etc). This significantly increases the audience and therefore increases your potential to sell.
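The limited-supply and royalty points above lend themselves to a small worked example. The Python below is a plain simulation, not an ERC721 contract and not Itheum's code: the `DataNFTEdition` class, its method names, and the use of basis points for the royalty are assumptions chosen for illustration.

```python
class DataNFTEdition:
    """A limited edition of Data NFTs with a fixed maximum supply."""

    def __init__(self, creator: str, max_supply: int, royalty_bps: int) -> None:
        if max_supply < 1:
            raise ValueError("max_supply must be at least 1")
        if not 0 <= royalty_bps <= 10_000:
            raise ValueError("royalty_bps must be between 0 and 10000")
        self.creator = creator
        self.max_supply = max_supply
        self.royalty_bps = royalty_bps    # basis points: 1000 = 10%
        self.owners: dict[int, str] = {}  # token id -> current owner

    def mint(self, owner: str) -> int:
        """Mint the next copy, refusing once the edition is exhausted."""
        if len(self.owners) >= self.max_supply:
            raise RuntimeError("edition is sold out")
        token_id = len(self.owners) + 1
        self.owners[token_id] = owner
        return token_id

    def resell(self, token_id: int, buyer: str, price: int) -> tuple[int, int]:
        """Transfer ownership; return (creator royalty, seller proceeds)."""
        if token_id not in self.owners:
            raise KeyError(f"unknown token id {token_id}")
        royalty = price * self.royalty_bps // 10_000
        self.owners[token_id] = buyer
        return royalty, price - royalty


# Two copies only, with a 10% royalty, as in the examples above.
edition = DataNFTEdition("creator", max_supply=2, royalty_bps=1000)
first = edition.mint("buyer-a")
edition.mint("buyer-b")
royalty, seller_proceeds = edition.resell(first, "buyer-c", 200)
```

With a re-sale price of 200 and a 10% royalty, the creator receives 20 and the reseller keeps 180. A third call to `mint` would raise, which is what makes the two existing copies scarce.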

Demo 1: minting and selling Genomics data as a Data NFT

Demo 2: minting and selling Blood tests results as a Data NFT on OpenSea


Methods for Buying Data

One-off datasets advertised for trade on the Data DEX are called “Data Packs”. As described in the above section “Decentralized Data Trade”, there are various types of data that can be put up for trade on the Data DEX. Along with one-off datasets (Data Packs), the Itheum Data DEX also allows for the trade of “Data Stream subscriptions” (see the later section titled Data Streams).

Data is traded via two channels:

  1. Direct between Data Creator and Buyer (Data Consumer)
  2. Via an intermediary, i.e. an authorised Data Coalition

Anyone with a crypto wallet can procure access to Data Packs or Data Streams under certain conditions:

  1. They need to have $ITHEUM tokens to pay for the data. The $ITHEUM token is sent directly to the Data Creator (if the sale is directly between Data Creator and buyer) or to a Data Coalition if the Coalition is “brokering the trade”.

  2. Each Data Pack or Stream will have an associated “terms of use”, and the buyer agrees to abide by these nominated terms. For sales made directly between Data Creator and buyer, dispute and conflict resolution processes will be added in the future to protect the seller from misuse.

  3. If the purchase is via an authorized Data Coalition, then the buyer needs to adhere to the Data Coalition's terms and conditions of use and also put in collateral in the form of $ITHEUM tokens for a certain period (until the buyer earns a higher credibility score). Data traded via a Data Coalition has more robust misuse remediation and dispute resolution processes, handled via decentralized governance.
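The three conditions above can be read as a pre-purchase check. The sketch below is illustrative only: the `Listing` and `Buyer` types, their field names, and the rule that collateral is waived once a credibility score reaches a threshold are assumptions made for this example, not the Data DEX's actual contract interface.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Listing:
    price: int                       # price in $ITHEUM smallest units (hypothetical field)
    creator: str                     # wallet of the Data Creator
    coalition: Optional[str] = None  # set when a Data Coalition brokers the trade
    collateral: int = 0              # collateral the coalition asks a new buyer to lock
    credibility_waiver: int = 100    # assumed score at which collateral is no longer needed


@dataclass
class Buyer:
    balance: int
    accepted_terms: bool
    credibility: int = 0


def check_purchase(buyer: Buyer, listing: Listing) -> tuple[bool, str]:
    """Return (allowed, payee) on success or (False, reason) on refusal."""
    if not buyer.accepted_terms:
        return False, "terms of use not accepted"
    required = listing.price
    if listing.coalition and buyer.credibility < listing.credibility_waiver:
        required += listing.collateral
    if buyer.balance < required:
        return False, "insufficient $ITHEUM balance"
    # Payment goes to the coalition when it brokers the trade, else to the creator
    return True, listing.coalition or listing.creator


direct = Listing(price=50, creator="creator-wallet")
brokered = Listing(price=50, creator="creator-wallet",
                   coalition="health-coalition", collateral=25)
buyer = Buyer(balance=60, accepted_terms=True)
print(check_purchase(buyer, direct))
print(check_purchase(buyer, brokered))
```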


Our 5-stage Product Development Process

To ensure we continually innovate and deliver tangible web3 / blockchain features to market - we will use a simple 5-stage product development process. All our features will continually be categorized as per these stages to ensure progress is transparent to our community.

  1. Research:Labs - Ideas that are running through an R&D stage.
  2. Detailed Design & Prototyping - Ideas that pass our labs stage and for which we are doing detailed solution architecture and prototyping. Release candidate notes are also prepared for formal development to commence.
  3. Available in Testnet - Iterative builds being released to testnet; tested and signed off by our QA team and community testers. This will continue until a mature build passes QA and is ready for deployment to mainnet.
  4. Security Audits - Final release that goes through internal and external security audits.
  5. Available in Mainnet - Released to mainnet.

Data DEX Components

The following key components form our Data DEX product. Together, they provide the core data protocol required to make Itheum a fully-featured decentralized data platform.


Peer to Peer Data Trade

The base functionality that's available when you first log into the Data DEX is the ability to discover and trade data with your "direct peers". You will be able to see all "advertised data" from others and also "advertise" your own data in the data marketplace. As detailed in the section "Decentralized Data Trade", you can advertise your data from apps that you have joined that are built on Itheum's Data CAT or you can also advertise any "arbitrary data" that you own.

When you place data for sale you will nominate a purchase price in $ITHEUM tokens, which is effectively the price for a Data Consumer to access your data. If someone wants access to your data, they will transfer the $ITHEUM tokens requested and in return get access to the data for use based on the "Terms of Use". The market operates completely in a "peer-to-peer" manner, with no intermediaries involved in the transaction. Data Creators and Data Consumers deal directly with each other and the entire process is mediated using Smart Contract technology.
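The flow described above (nominate a price, receive tokens, grant access) can be modeled in a few lines. This is a toy in-memory stand-in for the mediating smart contract, written only to show the order of operations; the class and method names are hypothetical and do not reflect Itheum's contract code.

```python
class PeerToPeerMarket:
    """Toy model of a direct trade: payment and access grant happen together."""

    def __init__(self, balances: dict[str, int]):
        self.balances = dict(balances)
        self.listings: dict[str, tuple[str, int]] = {}  # data_id -> (seller, price)
        self.access: set[tuple[str, str]] = set()       # (data_id, buyer) pairs

    def advertise(self, seller: str, data_id: str, price: int) -> None:
        """The Data Creator nominates a price in $ITHEUM for a dataset."""
        self.listings[data_id] = (seller, price)

    def buy(self, buyer: str, data_id: str) -> None:
        """Move tokens straight from buyer to seller, then record access."""
        seller, price = self.listings[data_id]
        if self.balances.get(buyer, 0) < price:
            raise ValueError("buyer cannot cover the nominated price")
        self.balances[buyer] -= price
        self.balances[seller] = self.balances.get(seller, 0) + price
        self.access.add((data_id, buyer))


market = PeerToPeerMarket({"alice": 0, "bob": 100})
market.advertise("alice", "sleep-data", price=40)
market.buy("bob", "sleep-data")
print(market.balances, ("sleep-data", "bob") in market.access)
```

On a real chain the balance check, the transfer, and the access record would all sit inside a single transaction, so a failed payment cannot leave access granted.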

This feature is currently in the "Available in Testnet" stage.


Data NFTs

Data is an asset in itself and personal data is a “unique asset” as no two personal datasets are the same. Highly personal or sensitive datasets can essentially function as an NFT allowing for uniqueness and limited availability.

For example, you might want to share your DNA or partial sequencing results under the “research” terms and conditions, but you may want to limit how many buyers can purchase and use it (controlling distribution). Data NFTs allow you to do this.

To make it more aesthetic to trade, we convert valuable datasets (usually in JSON or any other interoperable format) to a unique visual representation of that data (an abstract image that looks random but is generated deterministically). To do this, we use a generative art algorithm for image generation that's based on the unique signature of the data.
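To illustrate how an image can look random yet be fully determined by the data, here is a small identicon-style sketch. It is a stand-in and not Itheum's generative art algorithm: it assumes the "signature" is a SHA-256 hash over a canonical JSON encoding, and the function names are hypothetical.

```python
import hashlib
import json


def data_signature(dataset: dict) -> bytes:
    """SHA-256 over a canonical JSON encoding (sorted keys, compact separators)."""
    canonical = json.dumps(dataset, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).digest()


def render_pattern(signature: bytes, size: int = 8) -> list[str]:
    """Map signature bits to a left-right symmetric grid of '#' and '.' cells."""
    half = size // 2
    rows = []
    for r in range(size):
        left = ""
        for c in range(half):
            bit_index = r * half + c
            bit = (signature[bit_index // 8] >> (bit_index % 8)) & 1
            left += "#" if bit else "."
        rows.append(left + left[::-1])  # mirror the left half to get symmetry
    return rows


pattern = render_pattern(data_signature({"sample": "A", "reading": 42}))
print("\n".join(pattern))
```

The same dataset always yields the same grid, regardless of key order in the JSON, while any change to the data changes the signature that drives the drawing.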

Packaging and trading your personal data as NFTs has the following benefits (when compared to regular one-off trade using the DEX):

  1. Limit the distribution of your highly sensitive or protected datasets to a smaller number of copies (similar to limited edition NFTs we see in the market today)

  2. You can choose to earn royalties if your data is resold. You can limit the distribution of your data but should a buyer resell your data NFT, you have the option to earn a % as royalty. This is a game-changer and can prove to be a steady secondary income stream for you. This is especially true if your data is bulk curated into a Data Coalition - which will have a high buyer demand.

  3. Once data is packaged and tokenized as an NFT, it can be traded on any NFT marketplace (e.g. OpenSea - we already support this interface in our testnet). This significantly increases the selling power of your data as the audience for your data increases. Read about it in our blog post "Selling your Itheum Data NFT on OpenSea".

  4. Your unique data will be minted with an aesthetic “generative art wrapper” created using the unique signature of the data. E.g. your DNA data can be represented as a piece of unique art (similar to Autoglyphs or other generative art) which will have inherent value as it has rarity, creativity, and actual utility packed into it.

What real-world trade characteristics do NFT wrapped data assets provide?

Let's look at some key trade characteristics we can get by wrapping our data as an NFT and then opening it for trade.

When working with NFTs in general, the main actors to consider are:

  • Issuing Entity: The issuing entity behind the NFT (in Itheum's case; the Original Owner or a Data Coalition can be the Issuing Entity)
  • Original Owner: The original Data Creator who chooses to trade their data as a Data NFT
  • NFT Holder: The present holder of the Data NFT
  • NFT Purchaser: Someone with the intent of acquiring the Data NFT from the present NFT Holder


In this example, let's assume you want to trade your Genomics dataset as a Data NFT.

  1. You use the Itheum Data DEX to upload your data file and mint a new Data NFT. You are the Original Owner / Issuing Entity. The Data NFT is minted with a "proof of ownership" along with other metadata that enables access to the original data file if ownership is proven. All this metadata will be stored on-chain via the standard NFT metadata file schema standards.

  2. You then head over to a public NFT marketplace (e.g. OpenSea - where your Data NFT will already be available in your wallet). You place it for sale for 0.05 ETH.

  3. Buyer 1 comes along and purchases the Data NFT as they see value in owning a genomics/DNA-based dataset. Transfer of ownership now moves from you to Buyer 1. They are now the NFT Holder.

  4. But Buyer 1 does not intend to use the data for any utility. They are essentially a pure "data trader" and intend to resell the Data NFT at a higher price. They increase the sale price to 1 ETH.

  5. Buyer 2 comes along and wants to purchase ownership of the Data NFT as they intend to use it for their research into topics relating to Genomics. They are now the NFT Purchaser. As their data research requires absolute truthfulness in the data quality (e.g. they may intend to use it to train an ML model for disease diagnosis and need to guarantee that the training data has not been tampered with) - they choose to procure this data via decentralized blockchain-based trade instead of buying it from centralized data brokers or data sellers.

Let's look at the list of benefits each actor receives by trading data wrapped as an NFT:

Original Owner / Issuing Entity:

  1. Limit the supply of the dataset to just a single item (they can always mint more if they see demand grow)
  2. During initial sale on the open NFT Marketplace (e.g. OpenSea), assign a royalty % that they get paid during all future re-sales
  3. Benefit from all the open NFT Marketplace features to control the sale (duration / minimum period, highest bid, etc)

NFT Holder:

  1. Ability to speculate on the future price of non-fungible data and earn profits for re-sale
  2. Build Data NFT collections based on market/seasonal demand for certain types of datasets

NFT Purchaser:

  1. This actor gets the most value as they intend to use the contents of the Data NFT (raw dataset) for actual utility.
  2. They can view and prove the provenance of the NFT using on-chain lookup and identify the Original Owner
  3. They can view and prove the lineage of the NFT using on-chain lookup (Original Owner -> Buyer 1 -> Buyer 2)
  4. They can view and prove the veracity (truthfulness/accuracy) of the NFT using an on-chain lookup of the metadata and data proofs.
  5. They can request formal transfer of ownership to own the IP, if needed and if allowed in the original terms of sale. Although this feature is not an inherent quality of NFTs, it will be mediated via Data Coalitions and our "decentralized middleware" service.
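The provenance and lineage lookups listed above can be sketched as a replay of transfer records. This assumes the records mirror ERC-721 `Transfer` events, where a mint appears as a transfer from the zero address. The `Transfer` dataclass and `lineage` helper are hypothetical names, and a real lookup would read events from a node or indexer rather than a Python list.

```python
from dataclasses import dataclass

ZERO_ADDRESS = "0x" + "0" * 40  # sender recorded for a mint in ERC-721 Transfer events


@dataclass(frozen=True)
class Transfer:
    token_id: int
    sender: str
    receiver: str


def lineage(transfers: list[Transfer], token_id: int) -> list[str]:
    """Ordered holders of one token, starting with the Original Owner."""
    holders: list[str] = []
    for t in transfers:
        if t.token_id != token_id:
            continue
        if t.sender == ZERO_ADDRESS:
            holders = [t.receiver]           # mint: this is the Original Owner
        elif holders and holders[-1] == t.sender:
            holders.append(t.receiver)       # sale or transfer by the current holder
        else:
            raise ValueError("transfer log is inconsistent for this token")
    return holders


log = [
    Transfer(1, ZERO_ADDRESS, "original-owner"),
    Transfer(1, "original-owner", "buyer-1"),
    Transfer(2, ZERO_ADDRESS, "someone-else"),
    Transfer(1, "buyer-1", "buyer-2"),
]
chain = lineage(log, token_id=1)
print(" -> ".join(chain))
```

The first entry of the returned list answers the provenance question (who the Original Owner is) and the full list is the lineage.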

Data NFTs are currently in the "Available in Testnet" stage.


Data Coalitions

Independently trading personal data is inefficient and time-consuming. Continuing to curate and monitor the “terms and conditions” for each sale as well as to keep track of what data will be used for and by whom will quickly become overwhelming.

Your personal data (both the longitudinal data from your structured programs and highly personal & sensitive data from your Data Vault) — is also not very valuable “when viewed in isolation” — but when your data is “grouped” into clusters of similar people, it grows significantly in value as the volume and quality increases (e.g. your health data is worth > $1,500 if sold as part of a larger dataset). The grouped data then becomes useful for deep analysis or to train machine learning models for example. We believe that this is the future of how data will be sourced on the blockchain to train AI and for deep analytics.

Data Coalitions are DAOs where the "Creators" of the Data Coalition will bond $ITHEUM tokens to form and run it. The creators are called "Board Members" and they have an incentive to run the Coalition in the best interests of its "Members". Board Members have a "stake in the game" with their bond and therefore will need to always act in the best interest of the Members. Board members will also earn a share of the trade, so it's in their best interest to keep their Data Coalition as robust as possible to attract new Members (and therefore more Data).

Itheum envisions a future where the most successful Data Coalitions will be run by enterprises and SMEs (subject matter experts like legal and regulation experts, commercial data warehouses, academic/research institutes, government departments, etc) and will be the perfect balance between the commercialization of data and accountability to end-users (Data Providers). Board Members vote to agree on the terms and conditions and the governance (privacy and security) of the data trade. The parameters they agree to will be made visible to anyone who wants to join the Data Coalition.

Users (called Members) can then align to the Data Coalition that they feel acts in their best interests. You then delegate the ownership of your Data Packs, Data Streams, and Data NFTs to your preferred Data Coalitions. The Data Coalition will group your data into clusters of similar data and advertise the bulk datasets to interested Data Consumers, who can then procure access to the bulk datasets. In return, you can earn a steady value return on your data or choose to lock up your returns for the longer-term growth of the Data Coalitions that you support.

Data Coalitions also allow for "staking" of $ITHEUM tokens, where anyone can stake their $ITHEUM tokens with a Data Coalition (you don't HAVE to provide your data) to flag their support for the Data Coalition and to signal that the data is good (Crowd-Sourced Data Curation). This allows for the Itheum platform to be used by users who want to participate in the personal data economy but who don't necessarily want to provide their data for trade. All parties involved in the Data Coalition (Board Members who bond $ITHEUM tokens, Members who share data, and Members who stake $ITHEUM tokens) are incentivized relative to their role and stake, and earn micropayments after each trade is finalized.

Itheum's Data Coalitions are modeled on the Credit Union Philosophy

Decentralized Board Members

As introduced above; Data Coalitions are formed and run by a virtual board — they have additional governance responsibility and can mediate/provide conflict resolution, negotiate terms of sale of data with real-world entities and other Data Coalitions, etc. Board Members have to bond $ITHEUM tokens into the Data Coalition to ensure they have a “stake in the game”, after which they can stand for election and be voted in by other board members.

To prevent hard centralization, Board Members will serve a fixed term (if required by the Members - i.e. it's not mandatory), after which they will need to rotate out and be replaced with a new board. Board Members earn a share of the sale of data (paid out in $ITHEUM tokens) that is housed within the Data Coalition. They can also lose $ITHEUM tokens in the event they don’t represent their Members' best interests or conduct an incorrect sale of data (i.e. a trade that breaks the terms of sale contract), which then needs to be revoked, with damage compensation paid to both the sellers and buyers.

Although not mandatory, Members will be able to participate in ongoing periodic votes to express their satisfaction with the Board's performance. If satisfaction rates are low across multiple voting points, this will trigger a board rotation clause.
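One way to read the rotation clause is as a check over the history of satisfaction votes. The sketch below is an interpretation, not a specified rule: the 50% threshold, the requirement that the low votes be consecutive, and the function name are all assumptions chosen for illustration.

```python
def rotation_triggered(satisfaction_rates: list[float],
                       threshold: float = 0.5,
                       low_votes_needed: int = 2) -> bool:
    """True once `low_votes_needed` consecutive votes fall below `threshold`."""
    streak = 0
    for rate in satisfaction_rates:
        # A vote at or above the threshold resets the run of low results
        streak = streak + 1 if rate < threshold else 0
        if streak >= low_votes_needed:
            return True
    return False


print(rotation_triggered([0.8, 0.4, 0.3]))
print(rotation_triggered([0.4, 0.8, 0.4]))
```

Each Data Coalition could set its own threshold and streak length as part of the parameters its Board Members publish.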

Other Notable Properties of Data Coalitions

  • Data Coalitions enable "collective bargaining power" for end-users and will be a viable solution to the problem of centralized enterprise data silos that don't provide any value to the Data Creator.
  • DAO-based governance and decision-making will be used to manage the ongoing operations of the Data Coalition.
  • They will also be delegated custodians of “Data Vaults” and can autonomously trade high-value data from the Vaults by attaching it to the other datasets within the Data Coalition.
  • They will also be able to link with the "Trusted Computation Framework" and facilitate the privacy-preserving compute-to-data technology handshake, where 3rd parties will be able to run algorithms on the datasets housed within the Coalition without having the identity and privacy of the original Data Creators leaked.
  • They can efficiently facilitate "micropayments" to all their members in return for data. For example, a Data Coalition can have 1000 "members" who contribute data for bulk sales. After each sale, the 1000 members will be sent a share of the earnings via micropayments. Traditional banking payment systems are unable to handle these kinds of micropayments due to the overhead of fees - but blockchain payment mechanisms will be able to facilitate this well.
  • Data Coalitions in the future will also trade with other Data Coalitions and be connected to autonomous machines via a machine-to-machine type interface, e.g. wearables or EVs that join Coalitions directly and participate in the data economy.
  • They can allow (if voted by members) for "anonymous cohort analysis" of data trends via tooling provided via our "Data CAT" feature. For example; there may be a Data Coalition set up for the collection of "fitness and demographic data" - where you, as a Data Creator can align to and provide your wearable data from Fitbit or Garmin as well as append specific demographic data from your Data Vault (e.g. Gender, Age, Ethnicity) to enhance its value. Anonymous cohort analysis can then be enabled to visualize the type of data under the control of this Data Coalition. This adds more appeal to buyers who can preview data with more detail before committing to buying at a premium price.


Anonymous cohort analysis via our Data CAT integration

  • By default, the Itheum Data DEX supports any data uploaded in valid JSON format. But there may be some specific data sub-standards that will be more appealing to certain types of niche buyers. For example, buyers who are interested in Health and Genomics data for automated ingestion into their systems will prefer a more globally interoperable standard like FHIR (Fast Healthcare Interoperability Resources). Data Coalitions will be able to mandate this as a minimum requirement for their members and ensure that the data being contributed is in the FHIR JSON standard.
  • Anyone can start a Data Coalition but it will take some effort to progress it into an "operational mode" and attract new data to come under its control. E.g. if you start a new Data Coalition you will need to bring in Board Members with credibility and who will need to bond $ITHEUM tokens for their term duration. Once you have filled the minimum requirements for the Board Members, the Data Coalition then enters "operational mode" and can begin accepting data and $ITHEUM token stakes from regular users (Members). But being in "operational mode" is not sufficient to attract the best quality data; all details about Data Coalition Board Members are made public - so you must have some commercial experience in data-related matters to give you credibility. Any "slashes or disputes" arising from your Data Coalition's trade activity are also made public. This is very similar to how the Delegated Proof of Stake validator selection process works, where you can stake your tokens after doing some due diligence on the validator's reputation and past performance. So for a Data Coalition to be successful - it will need to be in an "operational mode" and have some credible "Board Members" whilst maintaining ongoing trade operational credibility.
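The micropayment property listed above (a sale shared among 1000 members) can be sketched as a pro-rata split over integer token units. This is an illustrative model only: the weighting scheme, the largest-remainder rule for leftover units, and the function name `distribute` are assumptions, and gas or network fees are ignored.

```python
def distribute(sale_amount: int, weights: dict[str, int]) -> dict[str, int]:
    """Split `sale_amount` (integer smallest units) pro rata by weight.

    Units left over after integer division go, one each, to the members with
    the largest fractional remainder, so payouts always sum to the sale amount.
    """
    total_weight = sum(weights.values())
    if total_weight <= 0:
        raise ValueError("at least one positive weight is required")
    payouts = {m: sale_amount * w // total_weight for m, w in weights.items()}
    leftover = sale_amount - sum(payouts.values())
    by_remainder = sorted(
        weights,
        key=lambda m: (sale_amount * weights[m]) % total_weight,
        reverse=True,
    )
    for member in by_remainder[:leftover]:
        payouts[member] += 1
    return payouts


# 1000 members with equal weight sharing one bulk sale
members = {f"member-{i}": 1 for i in range(1000)}
payouts = distribute(2_500_500, members)
print(sum(payouts.values()), min(payouts.values()), max(payouts.values()))
```

Weights could equally represent stake size or the volume of data contributed, which is how Board Members, data-sharing Members, and staking Members might each be paid relative to their role.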

This feature is currently in the "Detailed Design & Prototyping" stage.


Personal Data Proofs

In the decentralized DApps ecosystem (DeFi, DAOs, or any other application use cases enabled via DApps), Smart Contracts enable agreements between parties to execute based on "indisputable truths". For example, in a DeFi exchange, a trade transaction between two parties can happen backed by the on-chain state data that confirms that the transaction can indeed proceed (e.g. party A has the tokens to transfer and party B is entitled to receive the tokens based on some pre-agreed condition). Smart contracts enable trust-less interaction to happen between multiple counter-parties. Traditionally, transactions such as these in finance require a trusted intermediary such as a bank to coordinate.

But Smart Contracts do have a key limitation in that the data they have access to on-chain to facilitate such trust-less behavior is very limited. We only have data around transaction history and other such on-chain information (e.g. voting outcomes etc). For more complex trust-less applications to migrate to the blockchain, we need "richer, real-world data" to flow into smart contracts. Chainlink provides this real-world data via their Decentralized Oracle Networks and enables the technology of "hybrid smart contracts". These hybrid smart contracts use real-world oracle data to make decisions and carry out transactions between parties. Chainlink enables smart contracts to tap into real-world data like exchange price feeds, weather, or sensor data and allows for the contracts to mediate transactions between multiple parties based on outcomes resulting from smart contract code execution backed by real-world data.

But if blockchain technology is to branch out and handle many more mainstream use-cases, we will need a wider variety of real-world data to flow into the smart contract world. One type of data, in particular, that is not available today (via Chainlink or any other Oracle-based networks) but is crucial for richer blockchain use-cases is Personal Data.

As described in the above section titled "Types of Direct Trade", the Itheum platform enables Data Creators to place their personal data for direct peer-to-peer on-chain trade. One of the underlying qualities of this type of trade is that a "proof of the personal data" is stored on-chain. This proof is then used (by a Data Consumer) to verify that the data has not been tampered with before the trade happens.
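A proof of this kind can be pictured as a hash commitment: the proof is published when the data is advertised, and the Data Consumer recomputes it over the data they receive. The sketch below assumes SHA-256 over a canonical JSON encoding. The hash choice and the function names `make_proof` and `verify` are assumptions, not Itheum's documented scheme.

```python
import hashlib
import hmac
import json


def make_proof(data_pack: dict) -> str:
    """Hex digest that would be published on-chain when the data is advertised."""
    canonical = json.dumps(data_pack, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify(data_pack: dict, published_proof: str) -> bool:
    """Recompute the digest over the received data and compare with the proof."""
    return hmac.compare_digest(make_proof(data_pack), published_proof)


responses = {"employment_years": 6, "dependents": 2}
proof = make_proof(responses)

tampered = {**responses, "dependents": 0}
print(verify(responses, proof), verify(tampered, proof))
```

Only the digest needs to be public. The data itself stays off-chain, and any edit to it after the proof is published makes verification fail.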

These features enabled by the Itheum Data DEX can also be used as a "Personal Data Proof" by smart contracts to execute specific rules that enable transactions on-chain to happen that are backed by personal data.

This revolutionary concept is best explained with an example:

Home Loan Application - Real World Scenario
Jack wants to buy a house. He heads over to his bank and requests a bank loan. The bank needs to carry out a detailed assessment to make sure Jack is eligible for the loan (i.e. can he make repayments? what is the risk he will default on it?). The bank does a credit check on Jack and finds that his credit history is good. The bank's home loan broker then does an extended personal due-diligence assessment. Jack is asked to provide his financial history, income history, and other details about his spending habits, family, dependents, and work history. This information is collected using a detailed bank loan application form the bank has prepared as a standard document. Once Jack fills the form he needs to attest the form as holding "true" information (via a statutory declaration that is legally binding). The bank assesses the details on the form and makes a decision to give Jack the loan. Jack uses the loan and buys the house.

Now let's imagine that we want to port this entire scenario to the blockchain and remove as many intermediaries as we can (the bank, the credit history provider, the home loan broker, etc.)

Home Loan Application - DeFi Scenario
Jack wants to buy a house. He visits a DeFi Lending DApp that is a DAO (modeled after a real-world credit union but fully decentralized) that allows for borrowing based on voting-based approval and on-chain evidence of collateral (e.g. other assets that Jack owns in the same DeFi DApp or other DApps). Jack requests a loan and the DApp begins its automated due diligence process. As part of this process, the DApp invokes another DeFi DApp's smart contracts that allow for deep credit history checks (possibly via deep index lookups or via the Graph for more advanced lookups). The credit checks come back positive for Jack. The DeFi lending DApp then refers Jack to an application form built and run using Itheum's Data CAT, which includes all the due-diligence questions asked by banks during loan applications that need to be answered by an applicant directly (e.g. what is your employment history? your spending habits? your family details and other financial responsibilities? basically, all the personal information that is needed to make an informed decision about a borrowing risk but cannot be obtained using blockchain lookups). Jack completes the form and his responses are stored inside the Itheum Data DEX as a "Data Pack". Jack then advertises this Data Pack on-chain, and the proof of his responses is published on-chain and sent to the DeFi Lending DApp as a "Personal Data Proof" that attests to his responses to the form. The DeFi Lending DApp now has all the information it needs to take his application to the final DAO voting panel. The members of the DAO have all the attested information and proof to make a decision on the home loan (collateral confirmation, positive credit history, application form due-diligence responses, and on-chain proof). The DAO is happy with the application and approves the loan, and Jack receives the money.

Itheum provides the complete platform for these kinds of Personal Data Proofs that can be used for on-chain decision-making. Think of Itheum as the next layer of data inflow into the blockchain world. Core blockchains provide transaction data, Chainlink provides real-world event data and Itheum provides Personal Data Proofs directly from the end-user. When used in unison, these layers let us replicate many real-world scenarios using smart contracts and remove redundant intermediaries.

Watch more real-world use cases and code demos


Data Vault

You can store highly sensitive personal data in your Data Vault. For example, details about your gender, race, sexual preference, prior health conditions, financial history etc.

This sensitive data is encrypted using your own encryption keys (no one else can unlock and view it) and stored in decentralized, redundant storage (no one else can destroy it).

You can then choose to append data from your vault to the regular data you trade on the Data DEX. As this gives the “dataset” more context, it becomes more valuable to the buyer — and you will earn more $ITHEUM tokens.

As the data is encrypted using the user’s private key, we need to enable a frictionless UX during trade between buyer and seller, where keys need to change hands with minimum manual involvement between parties. For this purpose, “symmetric key pools” (decentralized middleware) are used to enforce secure authorization between seller and buyer in real-time. Symmetric key pools will operate using a modified proof-of-authority mechanism to enforce the highest security with balanced decentralization (please note that the specific technical design is still being finalized).

Other Notable Properties of Data Vaults

  • Their design will enable true data sovereignty via a "proof of ownership" based design architecture. Detailed technical design of data vaults will be released in our "Yellow Paper" but at a high level - all datasets that include sensitive data will be encrypted with keys that belong solely to the Data Creator. If a copy of the data is given to a new buyer, decentralized middleware will be used to mediate the handover of the data access rights with encryption handled behind the scenes (to ensure UX is seamless). But in the case of a Data NFT, where the actual ownership of a data asset can move from one party to another - the "proof of ownership" will also be transferred. This process will also be handled by the "decentralized middleware" service.
  • Data Vaults will enable a user to "opt-out" of the system in the event they do not want to share their data anymore (e.g. a requirement in GDPR). This is achieved by the above-mentioned "proof of ownership" protocol, where the unique decryption key can be "burned", which effectively makes all decentralized copies of the data (e.g. in IPFS or elsewhere) untethered from the Data Creator. The data without its decryption key is now effectively just a blob of scrambled text without any identity or utility attached to it. There are of course challenges to this that we need to solve, e.g. what happens if you sell your data and then change your mind after the sale? Do we allow for a recall of a data sale? If so, how can we ensure that a user has the ability to completely opt out?
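The "burn the key" idea above can be demonstrated with a toy model. The cipher here is a deliberately simple hash-based XOR stream written with the standard library so the example runs anywhere. It is for illustration only and is not secure for real use (a production vault would use a vetted authenticated cipher and never reuse a keystream). The `DataVault` class and its methods are hypothetical names, not the design that the Yellow Paper will specify.

```python
import hashlib
import secrets
from typing import Optional


def _keystream(key: bytes, length: int) -> bytes:
    """Derive `length` pseudo-random bytes from the key (toy construction)."""
    out = bytearray()
    counter = 0
    while len(out) < length:
        out += hashlib.sha256(key + counter.to_bytes(8, "big")).digest()
        counter += 1
    return bytes(out[:length])


def xor_cipher(key: bytes, data: bytes) -> bytes:
    """Encrypts or decrypts; XOR with the keystream is its own inverse."""
    return bytes(a ^ b for a, b in zip(data, _keystream(key, len(data))))


class DataVault:
    def __init__(self) -> None:
        self._key: Optional[bytes] = secrets.token_bytes(32)  # held only by the Data Creator

    def encrypt(self, plaintext: bytes) -> bytes:
        if self._key is None:
            raise RuntimeError("key has been burned")
        return xor_cipher(self._key, plaintext)

    def decrypt(self, ciphertext: bytes) -> bytes:
        if self._key is None:
            raise RuntimeError("key has been burned")
        return xor_cipher(self._key, ciphertext)

    def burn_key(self) -> None:
        """Opt out: stored copies of the ciphertext become unreadable blobs."""
        self._key = None


vault = DataVault()
secret = b'{"blood_type": "O+"}'
stored_copy = vault.encrypt(secret)      # what decentralized storage would hold
recovered = vault.decrypt(stored_copy)   # works while the key exists
vault.burn_key()                         # after this, stored_copy cannot be decrypted
```

Burning the key does not delete the stored copies. It only removes the ability to read them, which is why the open questions about data that was already sold and decrypted still apply.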

This feature is currently in the "Detailed Design & Prototyping" stage.


Data Streams

You can let buyers subscribe to “personal data streams”. Unlike the one-off datasets that can also be purchased on the Data DEX, data streams will continue to feed data once a “subscription” is purchased.

Data streams are a more powerful way for Data Consumers to subscribe to longitudinal datasets that grow over time. For example, health and wellness data like activity, sleep quality, and blood pressure, or financial activity like spending habits.

When paired with context-rich data from your “Data Vault” — Data streams become a valuable and steady source of passive income that unlocks in exchange for the sharing of access rights to personal data.

This feature is currently in the "Research:Labs" stage.


Trusted Computation Framework

As personal datasets under the control of Data Coalitions grow over time, certain end use-cases may require access to highly sensitive, identifiable data. Often these use-cases will provide the most “payout” for data usage (as such they are considered high-value use-cases). In such situations, a Trusted Computation Framework can be used to ensure computation is handled off-chain with tamper-proof integrity. The Data Coalition will coordinate these computation jobs on-chain (with possible coordination assistance of Chainlink’s Attested Oracles).

Personal data traded on the on-chain Data DEX is never stored on-chain. Only hash values are stored, to ensure the integrity of traded datasets. But in certain advanced use-cases where Data Coalitions manage the data of multiple users, there will be encrypted personal metadata stored on-chain. There will be cases where this data cannot be put on-chain even when encrypted due to privacy regulations, especially if the blockchain network is spread across multiple geographies. Off-chain execution is, in some cases, the only option for processing this data. The Trusted Computation Framework can be used to localize the computation of the data to ensure the data storage and processing complies with all data sovereignty regulations.

The Trusted Computation Framework is tethered to the “Regional Decentralization Hub” and is our "Compute-to-Data” solution for highly sensitive data processing requirements within high regulatory environments.

This feature is currently in the "Research:Labs" stage.


Regional Decentralisation Hubs

Highly sensitive data like medical data from hospitals, personal health records, financial transactions, or credit history are protected by regional or local sovereignty laws. To unlock the trade of this data we cannot use fully decentralized global storage or compute. For example; we may want to limit trade, storage, and compute on data to only the EU region so that it complies with laws like GDPR whilst also discouraging a central point-of-attack vector on these data and compute resources.

Regional Decentralisation Hubs are a novel idea we are exploring around regional decentralization, which balances legal sovereignty laws with personal data sovereignty.

This feature is currently in the "Research:Labs" stage.


The Data Metaverse and NFMe ID Avatar Technology


NFMe IDs (Non-Fungible Me IDs) are your Data Avatars of the Metaverse. Join the Data DEX and complete a "seed profiler job" and have your very own NFMe ID minted and stored in your wallet. NFMe IDs have "personal data categories" (PDCs) linked to them; these feed data into the NFMe IDs and this data is secured in personal Data Vaults. Example PDCs are social, financial, historical, internal and external.

Apps built on Itheum's Data CAT feed data into the PDCs. These apps are run by Itheum as well as other organizations who want to generate high-value data and then incentivize you to provide them access to your data. Itheum's Personal Data Adaptors can also discover and harvest on-chain and off-chain personal data and lock it inside your Data Vault and link it to your NFMe ID.

As more data is added to the NFMe ID avatar, its "data signature" changes and more "accessories", "evolution traits" and "skins" are made available to your NFMe ID avatar. This is akin to purchasing gaming accessories and traits to augment your in-game NFT characters. Your NFMe ID is organic and evolves like a human as more data is added to it. You can apply "skins" over your base NFMe ID avatar, so you can pick a different visual appearance for your NFMe ID avatar based on the situation you want to use your avatar in or the mood you want to convey (e.g. you can apply a photorealistic 3D skin, a "cartoonish" skin, or even a completely imaginary robotic skin).

Image description

What can I do with an NFMe ID?

  • NFMe ID avatars are NFT tokens; so they support all NFT capabilities. But you can never completely sell your NFMe ID; you can only lease it to others, who can then use it to access your data. It's an "authorization key" you provide to a 3rd party for gated entry to web3 and metaverse domains, and it can also represent and supply a personal data footprint to 3rd party applications (if you choose to allow it to do so). In this way; NFMe ID avatars are soulbound.
  • Explore Itheum's 3D Virtual Data Metaverse (called the Greenroom); which is a digital metaverse launchpad where you can manage your "accessories", "evolution traits" and "skins" and interact with other NFMe ID avatars.
  • Participate in the governance of Data Coalition DAOs - These are bulk data trading DAOs that exist to serve the people and protect personal data. Stake and Farm with existing Data Coalition DAOs and provide data curation and data quality assessment services or give them access to your NFMe ID and other data assets and allow them to trade your data on your behalf as part of a larger bulk dataset.
  • "Slice out" certain segments of your data (e.g. Data from a specific app or category) and mint them into Data NFTs and trade these in NFT marketplaces.

In the web2 world; your personal data is exploited by 3rd parties and large corporations. In Itheum's Data Metaverse; we give you true ownership of your data via your own Data Avatar... welcome to the era of the NFMe ID Avatar.

NFMe ID Interoperability

An NFMe ID is your virtual data avatar; so it ideally needs to be interoperable with other digital metaverses so you can explore these virtual worlds and ecosystems and interact with more digital assets. Itheum will have its own Virtual Data Metaverse (called the Greenroom) but we are also looking at methods to "port" our NFMe ID avatars into other virtual worlds.

The concept of interoperable metaverse avatars is in its very early stages of development but there is some promising progress being made by projects like WorlWideWeb3 - who will allow external NFT Avatars to be ported into their virtual world. We will also actively follow progress in this new space through the work of organizations like the Open Metaverse Interoperability Group and contribute to creating open standards for future metaverse interoperability.

Itheum may also eventually become the world's 1st fully decentralized open metaverse backed by true data ownership - a vision we hold close to our hearts.

This feature is currently in the "Research:Labs" stage.


Technology Philosophy

The following sections provide an overview of the underlying technology principles that we aim to use as the foundation for our evolving solution design and architecture. It covers elements like multi-chain design, cross-chain communication, and token design principles.


Multi-Chain Strategy

The Itheum Data DEX will work across EVM and non-EVM compatible chains. We already have the Data DEX deployed to the Ethereum, Polygon, BSC, Avalanche, Harmony, PlatON and Parastate (Future Polkadot Parachain) testnets. We are also working on a native middleware interface to enable support for non-EVM blockchains like Elrond, Hedera, Algorand, and others. Currently, data advertised on a particular chain is only visible to participants on the same chain, i.e. a buyer can only buy data advertised for sale on the same chain. This design is called the Multi-Chain Architecture.

We are working on cross-chain advertising and purchasing of data facilitated via cross-chain oracles, relays, and bridges. These will allow the transaction handshake to be done between chains, allow the data to be verified and transacted, and support the mint/burn/lock of chain native tokens to balance the maximum token supply across the chains (e.g. burning of Polygon tokens and minting of BSC tokens and vice versa as data is transacted across chains). This will be our Cross-Chain Architecture initiative.
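
To make the burn/mint balancing idea concrete, here is a toy in-memory sketch. It is not the Itheum bridge design: `ChainLedger` and `bridge_transfer` are hypothetical names, and the oracles, relays, and verification steps mentioned above are left out. It only shows that burning on one chain and minting the same amount on another keeps the combined supply constant.

```python
class ChainLedger:
    """Toy stand-in for a token contract on a single chain (hypothetical)."""

    def __init__(self, name):
        self.name = name
        self.balances = {}
        self.supply = 0  # tokens currently in circulation on this chain

    def mint(self, account, amount):
        self.balances[account] = self.balances.get(account, 0) + amount
        self.supply += amount

    def burn(self, account, amount):
        if self.balances.get(account, 0) < amount:
            raise ValueError("insufficient balance to burn")
        self.balances[account] -= amount
        self.supply -= amount


def bridge_transfer(source, target, account, amount):
    """Burn on the source chain, then mint the same amount on the target chain."""
    source.burn(account, amount)  # raises before anything is minted if the balance is too low
    target.mint(account, amount)


polygon = ChainLedger("Polygon")
bsc = ChainLedger("BSC")
polygon.mint("alice", 100)

bridge_transfer(polygon, bsc, "alice", 40)
print(polygon.supply, bsc.supply, polygon.supply + bsc.supply)
```

In a real deployment the burn and the mint happen in separate transactions on separate chains, so the relay or bridge has to guarantee that a confirmed burn is always followed by exactly one mint.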

Demo video of the Itheum Data DEX working on Ethereum and Polygon


Multi-Chain Technical Design Goals

  • Itheum is a "protocol" and not just a set of EVM contracts that make up a DApp. This protocol can be implemented on any blockchain platform.
  • The Itheum protocol is built into the Smart Contract layer. The protocol enforces 3 elements needed for data trade to happen on the blockchain; Proof of Lineage, Provenance, and Veracity. We have pioneered a way to deliver this via a nice UX that abstracts all the complexity from the end-user.
  • Therefore the protocol can be implemented in any blockchain and on any native runtime (EVM and non-EVM). As long as the native chain supports transaction consensus and token standards; Itheum can be ported to run on it.
  • Itheum aims to be chain-agnostic and fully interoperable. We follow the same design principles as Chainlink - we define the standard set of rules (i.e. protocol), enforced in immutable Smart Contracts which coordinate how data can be traded between parties. We will have interfaces in multiple blockchains that can coordinate the workload. Data platforms (like Itheum and Chainlink) need to be architected this way as they can form the foundation for future web3 core infrastructure. This also positions the technology to work across private, permissioned, and public chains to reach maximum adoption.
  • Itheum will have a single "primary chain" and multiple "side chains". The "primary chain" needs the best performance and cost-efficiency (so we intend to pick the best EVM or non-EVM blockchain as our primary chain). We can then build our side chain Smart Contract Coordinators on any other chain and runtime; and they can talk to each other via bridges, relays, middleware, and oracles.
  • The future of blockchain is interoperability; so we are designing Itheum this way from day one.

Cross Chain Tokens

The core Itheum utility token is an ERC20 token called $ITHEUM. The $ITHEUM token will allow for purchasing access rights for data from Data Creators and Data Coalitions. The $ITHEUM token will also play a role in staking against a Data Coalition and for decentralized governance of the Data Coalition’s responsibilities and actions.

As the Itheum Data DEX operates multi-chain, there will be core tokens that represent each chain with the total supply of $ITHEUM bridged and distributed as more chains are added.

As an example:

  • $ITHEUM — Token deployed on Elrond
  • $eITHEUM — Token deployed on Ethereum
  • $mITHEUM — Token deployed on Polygon/Matic
  • $bITHEUM — Token deployed on Binance Smart Chain (BSC)
  • $aITHEUM — Token deployed on Avalanche

$ITHEUM token can be moved between chains via native bridges that already exist. For example, you can use the native Elrond <> Ethereum token bridge to convert your $ITHEUM to $eITHEUM and vice versa.


Data Trading Fees

Theoretically speaking; Itheum consists of three data marketplaces - The Data Pack, Data NFT, and Data Coalition DAO data marketplaces. Of these three we foresee the Data NFT marketplaces having the highest data trade volume during the initial phases of our platform launch - given that our Data NFT product (and our related NFMe ID avatar product) will appeal to a broad range of users in the NFT, gaming and metaverse ecosystems.

All data traded within Itheum will attract a 2% seller fee and a 2% buyer fee - adding up to a total marketplace fee of 4% per trade. Please note that the buyer or seller will be able to take on each other's fees - so an active data buyer (who may be a large, established enterprise) may choose to take on the entire 4% fee to incentivize more sellers to share data. Alternatively, an active seller may choose to take on the entire 4% fee to incentivize more buyers. We feel this dynamic will create a competitive data marketplace that will facilitate the best deals for both the Data Creator and Data Consumer. (This pricing model is similar to the fee model from Rarible, which has proven to be very successful.)

Example:

  • A Data Creator wraps personal data into a Data NFT and lists it for trade in Itheum's Data NFT Marketplace
  • The Data Creator (Seller) selects a listing price of 100 $ITHEUM for their Data NFT
  • For a Buyer to procure the Data NFT; they will need to pay 102 $ITHEUM to complete the purchase (with 2% buyer commission added to the listing price)
  • Once the trade is complete; the Data Creator gets 98 $ITHEUM (with 2% seller commission passed onto them)
  • Therefore; 4 $ITHEUM fee will be charged in total and split between the Buyer and Data Creator (i.e. 4% of 100 $ITHEUM)
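
The arithmetic in this example can be sketched in a few lines. The function name `settle_trade` and its parameters are illustrative only; the 2% rates come from the text above, and the optional rate overrides are an assumed way of modelling one party taking on the other party's fee.

```python
from decimal import Decimal


def settle_trade(listing_price,
                 buyer_fee_rate=Decimal("0.02"),
                 seller_fee_rate=Decimal("0.02")):
    """Return what the buyer pays, what the seller receives, and the total fee."""
    buyer_fee = listing_price * buyer_fee_rate
    seller_fee = listing_price * seller_fee_rate
    return {
        "buyer_pays": listing_price + buyer_fee,
        "seller_receives": listing_price - seller_fee,
        "total_fee": buyer_fee + seller_fee,
    }


# Default split: 2% buyer fee and 2% seller fee on a 100 $ITHEUM listing.
trade = settle_trade(Decimal("100"))
print(trade)

# Assumed variant: the buyer takes on the entire 4% fee.
absorbed = settle_trade(Decimal("100"),
                        buyer_fee_rate=Decimal("0.04"),
                        seller_fee_rate=Decimal("0"))
print(absorbed)
```

`Decimal` is used instead of floats so that token amounts do not pick up binary rounding errors.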

Distribution of Trading Fees & Community Fee Sharing

As mentioned above; 4% of every data trade is charged by the Itheum platform as a data marketplace fee. Itheum will have a generous fee-sharing system in place with community members via staking incentives and also fund an operational budget to ensure the Itheum platform can sustain long-term roadmap development and community growth.

Here is how the 4% will be distributed and used:

  • 2% will be sent to the Foundation Treasury
  • 1% to the Partners program
  • 1% to the Community Treasury

The funds contributed via the marketplace fee to the Community Treasury are sent directly to stakers in the core $ITHEUM Community Staking Pool where relative distributions can be claimed weekly by stakers. (see section titled $ITHEUM staking opportunities)
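
As a rough illustration of the split above, the sketch below divides the fee on a 100 $ITHEUM trade into the three buckets and then shares the Community Treasury portion among stakers. The pro-rata rule in `share_among_stakers` is an assumption made for illustration (the text only says distributions are "relative"), and all names are hypothetical.

```python
from decimal import Decimal

# Shares of the listing price, taken from the breakdown above (they sum to 4%).
FEE_SPLIT = {
    "foundation_treasury": Decimal("0.02"),
    "partners": Decimal("0.01"),
    "community_treasury": Decimal("0.01"),
}


def split_marketplace_fee(listing_price):
    """Split the marketplace fee on one trade into its destination buckets."""
    return {bucket: listing_price * rate for bucket, rate in FEE_SPLIT.items()}


def share_among_stakers(amount, stakes):
    """Assumed rule: each staker's share is proportional to the size of their stake."""
    total_staked = sum(stakes.values())
    return {staker: amount * stake / total_staked for staker, stake in stakes.items()}


buckets = split_marketplace_fee(Decimal("100"))
payouts = share_among_stakers(
    buckets["community_treasury"],
    {"alice": Decimal("300"), "bob": Decimal("100")},
)
print(buckets)
print(payouts)
```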

For details on the "Foundation Treasury" and the "Partners" program and what it will be used for please read the section titled Token Distribution


Itheum Token

The “primary token” will exist on the Elrond blockchain. (Note that this may be subject to change as we continue weighing the pros/cons of multi-chain adoption. As low transaction cost, high transaction throughput, and security are very critical for the trading of data; we continue to evaluate the best candidate blockchain for our "primary token".)

The primary token will have the token symbol $ITHEUM and the side-chain tokens (called Side Tokens) will have a prefix character in front of the token symbol to identify them (e.g. $mITHEUM, $bITHEUM). This "prefix design decision" is also subject to change as we want to ensure the best user experience (UX) for data trading and we may alter the design to have a single $ITHEUM token that has the same name across various blockchains (similar to how USDC is called USDC across all blockchains).

The final decision on the “Primary Token Blockchain” will be announced before our Token Generation Event (TGE).


Token Utility

The $ITHEUM token is a pure “utility token” as without owning $ITHEUM you will not be able to use many features on the Itheum Data DEX to facilitate the open exchange of personal data.

At this point, the following tasks will REQUIRE the $ITHEUM token (or corresponding Side Tokens) — as we design and write our smart contracts we will also take an $ITHEUM first approach to align incentives so we can ensure that $ITHEUM’s position as a utility token can’t be disputed:

1) Gaining Access rights to use Data Packs / Data Streams / Data Coalition Data Pools

A Data Creator (who is the original Data Owner of the data) will allow an Access Requester to access their data via a transfer of $ITHEUM between the parties. $ITHEUM is essentially the "software key" to using the data from an authorization perspective. The primary goal would be to use the transaction of $ITHEUM (between the 2 parties as logged on-chain) as:

  • Evidence of access rights being granted by the Data Owner to another party
  • To trace and have evidence of the lineage of data access
  • To trace back access rights to a Provenance (Data Creator)

If the sale is done via a Data Coalition (i.e. as a bulk pool of data), the $ITHEUM is transferred from the Access Requester to the DC DAO and then distributed to the members of the DC as follows:

  • To the Data Creators to indicate access rights to use the data (as mentioned in the above section)

  • Based on the relative bond/stake token contribution of the board members, members, and general stakers to highlight traceability of the effort spent by these parties to coordinate the bulk sale, flag and signal data quality issues and accept the risk of remediating and mediating contentious sales. The transfer of $ITHEUM to these parties is also used as a lineage and audit to trace all parties involved in the data transfer process. See point number (3) below for more details on this.

2) Stake $ITHEUM to have relative voting rights in the Itheum Foundation (IF) DAO

The Itheum Foundation (IF) DAO will vote on "Proposals" on how the $ITHEUM Treasury will be spent to further Itheum's community development and roadmap. Please note that the full decentralization of the platform and the operation of the voting mechanism will be put into place at a future date. Read more about this in the Decentralized Governance section.

3) Stake/Bond $ITHEUM to create a new Data Coalition (DC) (which is governed by a DAO)

By doing so; you receive relative voting rights in return to manage the operation of the Data Coalition DAO (DC DAO). Staked $ITHEUM goes into the "DC Fund Pool" - which is then used for arbitration and dispute resolution in contentious sales of data. The DC DAO votes on "Motions" related to the bulk sale of data via the DC.

The following parties can Stake/Bond $ITHEUM:

  • Board Members will bond $ITHEUM to flag their commitment to the DC and to act in the best interest of the Members. The bond is locked in until the Board Member leaves the DC.

  • Data Creators or Contributors (called Members) can link their data to a DC and also stake some $ITHEUM to flag that they have a genuine interest in supplying good quality data to the DC.

  • Anyone (even those who do not want to provide data themselves) can stake $ITHEUM into an existing DC DAO. They do this by becoming a "data quality verifier" (resulting in a Crowd-Sourced Data Curation dynamic), effectively also signaling the "genuineness/credibility" of a Data Coalition (similar to how the credibility of a node validator in a Delegated Proof-of-stake (DPOS) system gets signaled by the community who delegates their stake with them). Everyone who stakes/bonds $ITHEUM is incentivized relative to their role and stake and earns micropayments after each trade is finalized. Learn more about the Data Coalition DAO design here


Token Metrics

Image description

High level characteristics of the $ITHEUM token

Image description

$ITHEUM token allocation buckets

Image description

$ITHEUM token TGE unlock & bucket cliff/vests/unlock schedule

Explanation of the Cliff / Vest / Unlock process:

  • Cliff (C): The period when tokens are fully locked.
  • Vest (V): Once the cliff is over, tokens become claimable. The claimable tokens are allocated on a fixed schedule over time, called the vesting schedule.
  • Unlock (U): The vested/allocated tokens then unlock at certain points. This is when the token holder gets access to their vested tokens.

Let's consider the following example:

Private Sale: 20% of max supply : V:24m / U:3m : TGE unlock 5%

  1. Private Sale investors receive 5% of their tokens at the token generation event (TGE), which occurred on March 28, 2022.

  2. The remaining 95% of their tokens are released progressively over a vesting period of 24 months (V:24m).

  3. During the 24-month vesting period, investors will be able to get access to their tokens in 3-month blocks - this is called the unlock period (U:3m). The 1st unlock date for the investors will be June 28, 2022.

Unlock Dates:

  • June 28, 2022 (start)
  • September 28, 2022
  • December 28, 2022
  • March 28, 2023
  • June 28, 2023
  • September 28, 2023
  • December 28, 2023
  • March 28, 2024 (end)
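
The unlock dates above can be reproduced with a short calculation. The sketch below assumes the remaining 95% is released in equal tranches at each unlock date; the text says "progressively" without giving the split, so the per-tranche percentage is an assumption, and `unlock_schedule` is a hypothetical helper.

```python
from datetime import date
from decimal import Decimal


def add_months(d, months):
    """Shift a date by whole months. Safe here because day 28 exists in every month."""
    month_index = d.month - 1 + months
    return date(d.year + month_index // 12, month_index % 12 + 1, d.day)


def unlock_schedule(tge, tge_unlock_pct, vest_months, unlock_every_months):
    """Return (date, percent of allocation) pairs, assuming equal tranches."""
    tranches = vest_months // unlock_every_months
    per_tranche = (Decimal(100) - tge_unlock_pct) / tranches
    schedule = [(tge, tge_unlock_pct)]
    for i in range(1, tranches + 1):
        schedule.append((add_months(tge, i * unlock_every_months), per_tranche))
    return schedule


# Private Sale: V:24m / U:3m with a 5% TGE unlock on March 28, 2022.
schedule = unlock_schedule(date(2022, 3, 28), Decimal(5),
                           vest_months=24, unlock_every_months=3)
for when, pct in schedule:
    print(when.isoformat(), f"{pct}%")
```

Under the equal-tranche assumption, each of the eight unlocks releases 11.875% of the allocation, which together with the 5% TGE unlock adds up to 100%.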

Why did Itheum design the token distribution schedule this way?

  • Itheum needs a long-term commitment from our supporters and this assures we get this from the team, investors, and advisors.
  • Itheum's token is a utility token that is used to "price" the access to data, effectively setting a "value" for data. To ensure there is steady long-term growth and stability for the utility token we need to balance token supply and demand. We also need to ensure that there is a predictable injection of new token supply into the market. This is why we put in place the "3-month" unlock mechanism that applies after the cliff and vest dates pass.

Token Distribution

As with all blockchain-based projects; Itheum will have an allocation of tokens that are not for sale. These non-sale tokens will be allocated to various buckets to oversee the roadmap, community, partners, liquidity, security, and various incentives needed for the continued growth of the Itheum ecosystem and to ensure we build a platform that becomes a market leader, brings the best value to the community, and is eventually governed by the community. The non-sale allocation groups and their initiatives are as follows (please be aware that the full initiatives under the groups are still an evolving topic of discussion but we aim to cover the following key items as a minimum):

Community (17%) & Ecosystem (23.5%)

  • Ambassador programs
  • Community growth campaigns
  • Community education campaigns
  • Community feature testing rewards (Beta feature user testing)
  • Development grants, hackathons & incentives
  • Grants to run Data Coalition DAOs
  • Grants to build apps on Itheum's Data CAT
  • Ongoing multi-chain liquidity reserves
  • Technical audits and white-hat hacking bounty programs, code reviews, and platform stability incentives
  • Home of the Community Treasury: holds the percentage of funds earned from marketplace fees that will be reserved for long term $ITHEUM stakers
  • Support "Partners"

Partners (2.5% - this is part of the Ecosystem bucket)

  • Key partnerships to enable "data market-making" (i.e. to ensure there is a constant demand for data)
  • Recruitment of a consumer data protection guild - These are global SMEs (Subject Matter Experts) in data privacy, data regulation, personal data protection advocacy, academia, and law. The consumer data protection guild will be available for Data Coalition DAOs to utilize as needed to ensure that data trade happening via Itheum is always in the best interest of all parties' rights
  • Funds for key operational & marketing supporters and ambassadors

Treasury (17%)

  • Home of the Foundation Treasury: funds are used to support Itheum's roadmap for the long term and will eventually be controlled by the Itheum Foundation DAO
  • Funds to cover unexpected expenses that might come up relating to platform operations, stability, security, and growth

Team & Advisors - 17.5%

  • Team (12%): Incentivise the core team and new team members to work and deliver the initial roadmap (until the community-run Itheum Foundation DAO is put into place)
  • Advisors (5.5%): Strategic advisors who will help Itheum bootstrap and reach optimum global growth

$ITHEUM Staking Opportunities

Itheum will have multiple opportunities for staking; this is to ensure we incentivize long-term HODLers who will, in turn, ensure that the Itheum network has price stability and high participation.

  1. Data Coalition DAO Staking: Data Coalition DAOs will allow for people to stake $ITHEUM against them to show support for their data trading operations and to flag support for data quality and credibility. Data Coalition DAO stakers earn a percentage from bulk trades that happen via that Data Coalition. Learn more about this in the Data Coalitions section.

  2. Community Staking Pool: Itheum will also have an internal, general-purpose staking pool called the "Community Staking Pool". $ITHEUM stakers in this pool will earn a share of the data marketplace fees (see the section titled Distribution of Trading Fees & Community Fee Sharing).

  3. Core Protocol Staking: The locking of tokens via staking incentives will lay a strong network security foundation for key features that we intend to roll out in the future; like our decentralized key pools (to power the per-user encrypted Data Vault feature), privacy-preserving compute, and regional decentralized hubs - key infrastructure that may interface with decentralized node operators and 3rd party specialized protocols to function.


Decentralized Governance

The Itheum platform will aim to be a public-goods platform that's fully decentralized. Public goods in the sense that all the technological developments made as part of Itheum's vision will always be available in the public domain and not privatized in any way. It will take some time to fully get there, especially in the area of our web2 data on-ramp/bridge technology (e.g. Data CAT), but it's worth noting that our web3 technology stacks will be fully available in the public domain from day one.

With our technology deliverables aligned to be made available as a fully public-goods platform, the next important aspect to decentralization is to have our technology development roadmap DAO controlled with $ITHEUM token holders being able to collectively and fairly decide on the technology strategy and direction of the roadmap delivery. It's worth noting that "fully DAO controlled" platforms are complex to set up and will require some platform operating maturity before implementing, but the ultimate intention of Itheum is full decentralization and this will happen progressively over time to ensure the Itheum platform will remain in the hands of the public - but at the same time, be a robust operating technology solution that will be around for the next 50 years. Until the platform roadmap is progressively transformed into a fully DAO-controlled element, the Core Team will manage the prioritization and delivery of the roadmap items with some pathways detailed on how the platform will transition to decentralization. This is described below in the "Foundation DAO" section.

Once the Itheum platform transforms into having its operations and roadmap strategy DAO controlled, the intention is to have the native token ($ITHEUM) be the governance token. New advanced yet seamless DAO schemes will be built around the $ITHEUM token that will increase the utility of the $ITHEUM token and also ensure the best user experience for our platform users and token holders. People who own $ITHEUM will be able to participate in DAO votes and collectively make decisions on the future of Itheum.

Itheum has 2 high-level forms of DAO schemes that will be implemented in our platform:

  1. Foundation DAO: This is the platform governance DAO that will eventually be responsible for the future direction of Itheum's technology strategy and roadmap.

  2. Data Coalitions DAO: Unlike the single Foundation DAO, there can be multiple Data Coalition DAOs. Each time a new Data Coalition is set up by the public and the structure reaches an "operating mode" it can be considered to be an independent DAO. The overarching parameters of the Data Coalitions DAO will be controlled by the Foundation DAO. So effectively, the Foundation DAO is considered to be a DAO of DAOs.

Platform Governance - Foundation DAO

As detailed above; the Foundation DAO will be responsible for the core platform's governance activities. Until the Itheum platform reaches the operational maturity required to fully decentralize the Foundation DAO, the Core Team will provide the governance in a way that's fully open to public visibility and accountability.

Operational Maturity

In this section, we will detail what it means when we say operational maturity and this dictates when Itheum will progressively transition to a fully DAO controlled Foundation DAO. For an ambitious platform like Itheum to gain widespread mainstream adoption and deliver robust technology solutions amid high competition from other commercial and public organizations - we will need to have fast iterative delivery of roadmap items and make objective decisions to ensure we get Itheum to the Mainnet in a state that makes it the number 1 data platform for web3.

It's a well-known observation that putting in a fully decentralized DAO too early can slow down decision making and delay competitive timelines and therefore we need to be cognisant and pragmatic on how early we embrace full decentralization, as failure to do so will put the entire Itheum platform delivery at risk and affect all of the Itheum community and token holders. Once Itheum has been deployed to the Mainnet and the day-to-day operations of the platform are in a stable and controlled state and we are ready to move into iterative continuous improvements; we will then begin rolling out the Foundation DAO schemes and start opening up public voting for further roadmap upgrades. Such an approach will ensure the long-term success of Itheum and is in the best interests of the entire community.

Roadmap milestones

The below details Itheum's roadmap milestones:

Image description

Roadmap methodology

Itheum's roadmap has always been public and is available for everyone to view. It can be accessed here: https://itheum.io/roadmap

As seen in the roadmap board; all items that need to be considered for inclusion in the roadmap are added here and are voted on by the Foundation DAO to transition them to the appropriate lanes for delivery. We follow the Kanban methodology to manage the transition of items from inception to delivery. An idea is registered in the IdeaBox and can then move to Research and Development after the Foundation DAO voting agrees that the core team can spend the time and effort to commence R&D activity on the task. Once R&D is complete and the team is ready to begin design and estimation; the task will then move to the Estimating lane. The Foundation DAO then can decide when to schedule the work. Once it's ready to start the development phase it will then move to the "Sprint Candidate" lane and then development and testing begin. The final lane is called Shipped and an item moves there once it's deployed to production (and/or mainnet).

DAO Technology Schemes

The Foundation DAO feature is still under development but we detail our current design scheme goals below. Please note that this is subject to change as we iterate on our design and we will continue to keep this section of the whitepaper updated with any changes and also inform our community via our channels of any changes.

  • The $ITHEUM token also plays the role of a Governance Token. There won't be any other, dedicated governance token in Itheum. This ensures that the $ITHEUM token has more utility in the Itheum ecosystem and that there also won't be a proliferation of bespoke governance and other tokens in Itheum - this boosts the user experience of the platform as it keeps things simple.
  • Voting is by Quorum + Direct Democracy
  • $ITHEUM holders can stake their tokens into the Governance contract and in turn, they will be able to vote on proposals
  • The weight of each user’s vote is proportionate to the number of tokens they have staked
  • Users can exit their stake anytime, but their vote will be withdrawn if they exit during an active voting round
  • Proposals can be decisions to roadmap updates or changes to core platform parameters
  • Core Platform Parameters can be tasks like Update Quorum %, Approve new Data Coalition applications, Manage core Data Coalition parameters (min:max members / min fees to join), Setting Harberger Tax rate on Coalitions managing Data NFTs, Manage Key-pool parameters for Personal Data Vault nodes (max / min / rotation / bonds), Manage node parameters for Trusted Computation framework (max / min / rotation / bonds) etc.
  • Roadmap updates are proposals to move the development roadmap forward (see above in Roadmap methodology section)
  • Potential issues with this scheme that we will need to design around are Governance Locks if a quorum is not reached and whale dominance
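
The voting rules listed above (stake to vote, stake-weighted votes, a quorum, and votes withdrawn on exit) can be sketched as follows. This is a toy model rather than the planned Governance contract: the class and method names are hypothetical, and measuring the quorum against the total currently staked is an assumption, since the base for the quorum is not specified.

```python
class GovernanceSketch:
    """Toy model of one voting round with stake-weighted votes and a quorum."""

    def __init__(self, quorum_pct):
        self.quorum_pct = quorum_pct  # assumed: percent of total staked that must vote
        self.stakes = {}              # user -> staked $ITHEUM
        self.votes = {}               # user -> True (for) / False (against)

    def stake(self, user, amount):
        self.stakes[user] = self.stakes.get(user, 0) + amount

    def vote(self, user, support):
        if self.stakes.get(user, 0) == 0:
            raise ValueError("only stakers can vote")
        self.votes[user] = support

    def exit_stake(self, user):
        """Exiting is always allowed, but it withdraws any vote in the active round."""
        self.stakes.pop(user, None)
        self.votes.pop(user, None)

    def tally(self):
        weight_for = sum(self.stakes[u] for u, s in self.votes.items() if s)
        weight_against = sum(self.stakes[u] for u, s in self.votes.items() if not s)
        total_staked = sum(self.stakes.values())
        turnout = weight_for + weight_against
        quorum_met = total_staked > 0 and turnout * 100 >= self.quorum_pct * total_staked
        return {
            "for": weight_for,
            "against": weight_against,
            "quorum_met": quorum_met,
            "passed": quorum_met and weight_for > weight_against,
        }


gov = GovernanceSketch(quorum_pct=50)
gov.stake("alice", 60)
gov.stake("bob", 30)
gov.stake("carol", 10)
gov.vote("alice", True)
gov.vote("bob", False)
before_exit = gov.tally()

gov.exit_stake("alice")  # alice's vote is withdrawn together with her stake
after_exit = gov.tally()
print(before_exit, after_exit)
```

The sketch also makes the two listed risks visible: a proposal with low turnout simply fails the quorum check (a potential governance lock), and a single large staker such as "alice" decides the outcome on her own (whale dominance).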

Data Coalition DAOs

Details about the Data Coalition DAOs are explained in the above section titled Data Coalitions, but we will summarise the DAO scheme below to lay out the various features of this component. Please note that this is subject to change as we iterate on our design and we will continue to keep this section of the whitepaper updated with any changes and also inform our community via our channels of any changes.

  • These DAOs are generated each time a new Data Coalition is formed and approved by the Foundation DAO to operate.
  • These DAOs are programmatically built via a factory contract that generates the base Data Coalition DAO according to the core parameters (these parameters can be altered later by the Board Members of the DAO).
  • Anyone can form a Data Coalition DAO by bonding $ITHEUM. The creator is called the Chairman but they don't have any special rights. They have the same rights as Board Members.
  • Once the Chairman creates the DAO, it goes into a phase where new Board Members need to be recruited. Board Members also need to bond $ITHEUM and be voted in by the other Board Members. This is akin to Permissioned entry where existing Board Members need to recommend you via a Motion
  • Only Board Members can vote on Motions; which follows a representative democracy scheme. A Motion can be anything that ranges from adding more Board Members to changing the parameters of DAO (min:max members / min fees to join, terms of sale, sale price etc) and also to agree on which purchase request to approve (i.e. who to sell data to).
  • Voting is by Simple Majority (no quorum needed), this allows for fast decisions to be made on potential new data sales.
  • All funds raised via bonds and stakes go into a DC Fund Pool which is then used for arbitration and dispute resolution in contentious sales of data.
  • The DC Fund Pool will be controlled by a Multi-Sig Wallet that will require a minimum set of Board Member signatures to process transactions.
  • Once the minimum Board members have joined the Data Coalition, it will enter into "operational mode" where it can start accepting Members.
  • Board Members will provide some public profile information to provide transparency on who they are, this is to provide some information for future Members to make informed decisions on if they should join the Coalition. This is very similar to how the Delegated Proof of Stake validator selection process works, where you can stake your tokens after doing some due diligence on the validator's reputation and past performance.
  • Anyone can link their data to a Data Coalition and join as a Member. They can also choose to Stake $ITHEUM along with their data to provide more guarantee that they are aligned to the long-term mission of the Data Coalition.
  • Everyone who joins the Data Coalition (Board Members and Members who contribute data) - starts with a low reputation score that builds up over time with each successful data trade.
  • Data Coalitions also allow for "pure staking" of $ITHEUM, where anyone can stake their $ITHEUM into a Coalition (you don't HAVE to provide your data) to flag their support for the Data Coalition and to signal that the data within the Coalition is good (Crowd-Sourced Data Curation). They are also considered to be a Member.
  • Members who staked $ITHEUM and Board Members get a majority share of each sale. A minority share is available for pure stakers, data providers who did not “stake” and/or who have a low reputation
  • Members can exit anytime but Board Members need a Motion passed to leave or be replaced. Board Members also need to wait until the bond period ends to exit.
  • Although not mandatory, Members will be able to participate in ongoing periodic votes to express their satisfaction with the Board's performance. If satisfaction rates are low for multiple voting points - this will trigger a board rotation clause.
  • Although not mandatory, Members will be able to expect the Board to have a fixed term, after which they will need to rotate out and be replaced with a new board.
  • A Data Coalition can only be shut down if it's not operational, if it does not have any outstanding disputes to resolve, and if all Board Members and Members have been compensated for any sales made in the past. The final decision to terminate the operation will be made by the Foundation DAO.
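
The majority/minority revenue split described in the list above can be sketched in a few lines. This is only an illustration: the paper does not state the actual percentages or how a pool is divided among its members, so the 70/30 ratio, the equal split inside each pool, the handling of an empty pool, and all names (`Participant`, `in_majority_pool`, `split_sale`) are assumptions made for this example.

```python
from dataclasses import dataclass


@dataclass
class Participant:
    address: str
    is_board_member: bool = False
    staked: float = 0.0           # $ITHEUM staked into the Coalition
    provides_data: bool = False
    low_reputation: bool = False


def in_majority_pool(p: Participant) -> bool:
    """Board Members, and data-providing Members who staked, share the majority pool."""
    if p.is_board_member:
        return True
    return p.provides_data and p.staked > 0 and not p.low_reputation


def split_sale(amount: float, participants: list[Participant],
               majority_ratio: float = 0.7) -> dict[str, float]:
    """Split one sale between the two pools (hypothetical equal split inside each pool)."""
    majority = [p for p in participants if in_majority_pool(p)]
    minority = [p for p in participants if not in_majority_pool(p)]
    pools = [(majority, amount * majority_ratio),
             (minority, amount * (1 - majority_ratio))]
    # Hypothetical rule: if one pool is empty, the whole sale goes to the other pool.
    if not majority or not minority:
        pools = [(majority or minority, amount)]
    payouts: dict[str, float] = {}
    for members, pool_amount in pools:
        for p in members:
            payouts[p.address] = pool_amount / len(members)
    return payouts
```

With one Board Member, one staking data provider, one pure staker and one non-staking data provider, a sale of 100 $ITHEUM would pay roughly 35 to each of the first two and 15 to each of the last two under these assumed parameters.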

Fraud Detection — “Gaming” the system

As detailed in the above sections, the Itheum Data DEX allows for the trade of personal data. The reward for each trade will be distributed in the platform’s native $ITHEUM utility tokens. As the demand for $ITHEUM tokens grows, we anticipate there will be malicious individuals or parties that try to “game the system” to obtain $ITHEUM tokens or to disrupt the market activity of legitimate data trade.

Types of Attacks

The following are some attack scenarios we anticipate:

1) Selling Fake Data

A malicious user or a botnet can potentially spin up hundreds of addresses and upload fake data files. The intention here would be to spray dummy datasets across the marketplace and pass them off as legitimate datasets. E.g. if a Data Coalition that aligns with the “sale of health data for commercial use-cases” has a high return in terms of sales, the malicious party can upload fake data and tag it as legitimate blood pressure readings. The malicious party can then align to that Data Coalition, and the data is then piped into the Data Coalition’s data pool. This kind of attack will diminish the data quality of the overall Data Coalition, as buyers will rate the pool data quality as low and/or request refunds. The malicious party's aim is to pass the data off as legitimate and earn some $ITHEUM tokens before the act is discovered and the party is blacklisted.

The attack can also happen in a direct peer-to-peer sale, where the malicious user uploads fake data, writes up an appealing and legitimate-looking data preview headline, and hopes that a buyer will be tricked into making the $ITHEUM transfer before discovering the data is fake.

2) Selling “Doctored Data”

This is similar to the above attack, but instead of uploading and selling fully fake data, the Data Creator/Seller doctors or manufactures data so that it looks accurate. For example, as blood pressure data has a standard “mask” (e.g. 123/89), they can generate some random data that looks like accurate data (134/32, 123/90). They can use scripting to generate bulk quantities of doctored data and automate the upload and sale of it as described in the above section (i.e. via a Data Coalition or direct peer-to-peer sale).
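
To make the point about the “mask” concrete, here is a minimal sketch of why a format check alone cannot catch doctored readings. The regular expression, the function names and the numeric ranges are all assumptions made for this illustration; the ranges are rough sanity bounds, not clinical guidance and not part of the Itheum platform.

```python
import re

# Hypothetical "mask" for a blood pressure reading: systolic/diastolic, 2-3 digits each.
BP_MASK = re.compile(r"\d{2,3}/\d{2,3}")


def matches_mask(reading: str) -> bool:
    """True if the reading has the expected shape, e.g. '123/89'."""
    return BP_MASK.fullmatch(reading) is not None


def looks_plausible(reading: str,
                    systolic_range: tuple[int, int] = (70, 250),
                    diastolic_range: tuple[int, int] = (40, 150)) -> bool:
    """A slightly stronger check: values in an assumed range and systolic > diastolic."""
    if not matches_mask(reading):
        return False
    systolic, diastolic = (int(part) for part in reading.split("/"))
    return (systolic_range[0] <= systolic <= systolic_range[1]
            and diastolic_range[0] <= diastolic <= diastolic_range[1]
            and systolic > diastolic)
```

Both of the doctored examples above pass `matches_mask`. Under these assumed bounds `looks_plausible` would reject "134/32" but still accept "123/90", which is why value-level checks cannot replace the staking, reputation and identity measures described later.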

3) DDOS/Spam via Fake Orders

This is very similar to the attack described in the "Selling Fake Data" section but the intention here is more to disrupt the natural marketplace activity by spamming it with fake data orders that make it impossible for legitimate orders to be viewed and purchased. The attacker/s can spin up multiple identities and advertise small bits of data for sale until they pollute the marketplace or the Data Coalition’s data pools with fake activity.

4) Uploading Irrelevant/Inappropriate Data

In this type of attack, a malicious user will attempt to cause reputational damage to the Itheum Data DEX by uploading irrelevant or inappropriate data. They may masquerade as a legitimate user and then undermine the credibility of the platform by uploading bad data that will be seen by other legitimate users of the platform. This may also raise regulatory concerns if the data is especially inappropriate. Over time this type of attack will "spam" the natural, organic network activity and degrade the reputation of the Itheum Data DEX.

5) Sybil Attack

This type of attack is something all distributed systems and blockchains are prone to. It involves malicious parties using “multiple identities” to flood a chain in order to take control of governance or core functionalities (like producing blocks in a blockchain or voting in their fake proposals). The Itheum Data DEX is built on EVM and non-EVM compatible chains (Elrond, Ethereum, Polygon, etc), so the inherent risk of Sybil attacks on the core chains will impact Itheum’s data trade activity. The Itheum Data DEX is also prone to Sybil attacks of a different type, where a malicious user masquerading as a Data Creator/Seller spins up multiple identities and floods the marketplace with fake orders or attempts to take control of the governance and voting activities that are core to the Data Coalitions.

Methods for Prevention

The following methods are (or will be) implemented to mitigate the above attack vectors.

  • The selling of personal data (either via direct peer-to-peer sales or via a Data Coalition) involves an “advertising process” (described in a section above), where the integrity of the raw data is advertised on-chain to facilitate the trade process. As this is a “write-transaction”, it requires GAS to complete. If a malicious user attempts to spam or dump fake datasets for sale on the marketplace, the GAS cost will make the attack impractical. This is especially true if the chain is Ethereum; for low GAS fee chains like Elrond or Polygon the risk remains, but at lower levels. This applies to direct peer-to-peer sales of Data Packs as well as to sales of Data NFTs.
  • Currently, a Data Preview is attached to each Data Pack and Data NFT sale offer. These “Data Previews” are currently entered manually by the seller, which allows a malicious user to put in a fake preview and upload fake data.

    Data Previews can be “Gamed” as they are manually entered

We are working on a new metadata field called “data snapshot”, which will capture a specific, random part of the uploaded file or stream and attach it to the dataset order. To prevent the “data snapshot” from exposing any sensitive data, the seller will be offered a few candidate “snapshots” upon upload of the data file and will be able to pick the one they prefer (similar to how you can pick a generated thumbnail for a video uploaded on YouTube).
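
One way the “data snapshot” idea could work is sketched below. This is not the planned implementation: the function name, the fixed snapshot length and the choice of byte offsets are assumptions made for this example. It draws a few random, fixed-length slices from the uploaded file and returns them as candidates for the seller to choose from.

```python
import random
from typing import Optional


def candidate_snapshots(data: bytes, count: int = 3, length: int = 64,
                        seed: Optional[int] = None) -> list[tuple[int, bytes]]:
    """Return up to `count` (offset, slice) pairs taken at random positions in `data`."""
    rng = random.Random(seed)
    if len(data) <= length:
        # File is too small to sample from: the only possible snapshot is the whole file.
        return [(0, data)]
    max_start = len(data) - length
    # Distinct random start offsets, so no two candidates are identical.
    starts = rng.sample(range(max_start + 1), k=min(count, max_start + 1))
    return [(start, data[start:start + length]) for start in sorted(starts)]
```

Because the offsets are picked by the platform rather than typed in by the seller, the seller can only select among real slices of the uploaded file, which is the property a manually entered preview lacks.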

  • Data Creators/Sellers can align to a Data Coalition to leverage the “power selling” of datasets. To align to a Data Coalition and then pipe their data to the Coalition for sale, the Data Creator/Seller will need to stake some $ITHEUM against the Coalition DAO. The more they stake, the higher the data quality score attached to the originating Data Creator. This gives the Data Creator/Seller “skin in the game”: an incentive to avoid behavior that would result in them being penalized and having their $ITHEUM revoked.
  • Staking/Voting against a Data Asset to boost credibility: We are looking at methods to attach community staking and/or voting to data assets that are put up for sale. This may be voting that happens as part of the Data Coalition (i.e. if a specific user’s data is gaining more demand, the DAO upvotes the user’s credibility). As the core credibility grows, the value of the user’s data (in terms of $ITHEUM cost) also increases. Based on this method, when a new Seller (i.e. a newish Chain Address) tries to sell data, the credibility score will be extremely low and therefore the return will be lower. This provides the organic incentive needed for legitimate behavior, as the best way to earn more $ITHEUM would be to upload legitimate data, align to strong Data Coalitions and gain organic credibility over time.

This approach will organically generate a sort of confidence score for each dataset or data NFT being sold on the marketplace. This is very similar to the OpenSea Confidence Score dialog alert you see when you attempt to buy NFTs.

OpenSea Confidence Score notification for purchases

  • Multi-Account Detection and Blacklisting: We currently use Moralis for user account management. Moralis allows for multiple addresses to be linked behind the scenes to a single user. This enables us to have an audit trail of users trying to launch any potential address-based Sybil attacks and allows us to blacklist transactions in near real-time.
  • Verified Identity and/or KYC of Data Seller: One method to prevent most attacks described above would be to use a Verified Identity or KYC platform to ensure that the Data Seller has gone through a verification process before they can sell their data. To avoid a high barrier to entry and boost adoption with a better UX, we may allow an “unverified” sale of data, with limited functionality in the DEX and/or much lower $ITHEUM earnings. This is to incentivize the user to verify, and is similar to CEXs like Binance which allow trading of smaller amounts without verification. We are also investigating the BrightID project for fully decentralized proof of uniqueness, and other mainstream KYC platforms like CIVIC, as potential solutions.
  • To increase the legitimacy of Data DEX user accounts and to prevent the attacks described above, we are also considering using a mobile phone number as a 2nd factor on the user identity record. This path makes sense if we do not want to proceed with a full KYC tool but still want to ensure that each user has a 2nd factor they must prove they own before using the Data DEX. As obtaining a mobile number usually has compliance and audit trail requirements attached to it (for example, you need to provide your driver's license or prove your identity and address), this makes attacking the system very impractical, as each user will generally only have access to one mobile number (of course, there are exceptions to this, but a single user can't own hundreds of mobile numbers, for example).
  • To prevent dumb botnet attacks we will implement "invisible CAPTCHA". These are modern captcha technologies that don't have the bad UX of traditional captcha (i.e. clicking a button or selecting patterns from photos) but can weed out fraudulent automated users or bots with very high accuracy.
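
The multi-account detection and blacklisting idea from the list above can be illustrated with a small in-memory registry. This sketch deliberately does not use the Moralis API; the class `AddressRegistry`, its method names, and the choice to lowercase addresses for comparison are assumptions made for this example.

```python
from collections import defaultdict


class AddressRegistry:
    """Hypothetical registry linking many chain addresses to one user record."""

    def __init__(self) -> None:
        self._user_of: dict[str, str] = {}
        self._addresses_of: defaultdict[str, set[str]] = defaultdict(set)
        self._blacklisted_users: set[str] = set()

    def link(self, user_id: str, address: str) -> None:
        """Record that `address` belongs to `user_id` (one owner per address)."""
        address = address.lower()
        owner = self._user_of.get(address)
        if owner is not None and owner != user_id:
            raise ValueError(f"{address} is already linked to another user")
        self._user_of[address] = user_id
        self._addresses_of[user_id].add(address)

    def blacklist_address(self, address: str) -> set[str]:
        """Blacklist the owner of `address`; return every address that is now blocked."""
        user_id = self._user_of.get(address.lower())
        if user_id is None:
            return set()
        self._blacklisted_users.add(user_id)
        return set(self._addresses_of[user_id])

    def is_blocked(self, address: str) -> bool:
        """True if the address belongs to a blacklisted user."""
        return self._user_of.get(address.lower()) in self._blacklisted_users
```

Flagging a single address blocks all sibling addresses of the same user, which is what makes an address-based Sybil attack more expensive than simply generating new key pairs. It only helps for addresses that were actually linked to the same user record.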

We are sure that there will be many more methods malicious users will use to try to game and disrupt the Itheum platform, Data DEX, and data marketplace ecosystem. As part of our core governance model and token distribution, we will reserve a portion of the funds to investigate new attack vectors and to incentivize this type of white-hat hacking and remediation behavior, so that we continue to boost the security of our platform.


Data Collection & Analytics Toolkit Components (Web2 Data Bridge)

The following are the key components that come together to make up the Data CAT.


Smart Data Types

The core element of the Data CAT is a concept called “Smart Data Types”. These are core building blocks that you can use to build advanced data collection apps. Think of them as the “composable elements” that can be used to generate high-value data.

Smart Data Types is a revolutionary concept for data collection that’s native to the Itheum platform. The core Itheum team will regularly build new Smart Data Types and add them to the library. There will also be a public marketplace of 3rd party Smart Data Types produced by our community. 3rd party Smart Data Type developers will be incentivized to contribute their work with token incentives and grants released by the "Itheum Ecosystem Fund".

Here is an example of a Smart Data Type that is available for use today for the Health and Wellness industry: https://itheum.com/smartdatatypes/blood-pressure-left-arm-sitting

Here are all the other available Smart Data Types: https://itheum.com/smartdatatypes

App Builder

The Data CAT provides interfaces for app developers to build highly flexible data collection and analytics experiences. Apps can be built using our "no-code" toolkit available in the "Management Portal". The "no-code" approach allows for a click and build approach to data collection and analytics, where you can reuse prebuilt templates for scheduled data collection or build out your own custom schedules and include tools from an already built collection of Smart Data Types, engagement channels (e.g. Email, SMS, Telegram), data visualizations, machine learning analysis layers (e.g. sentiment analysis), reports, alerts/thresholds (e.g. irregular data pattern detection), video education and many more tools.

You can also freely "clone" existing apps and build on top of them. As an example; if you like the OKPulse employee wellness app - you can clone it and build on top of it by adding new Smart Data Types and data visualizations. We are also working on releasing an SDK and API for people who would like to build apps on Itheum using a programming interface.

Omnichannel Data Collection

To reach the broadest user base and bridge as much highly structured, outcome-oriented data as possible from the web2 world into the web3 world, Itheum aims to support a wide range of data collection "channels". Channels are pathways through which data is actively and passively collected from users.

Itheum currently supports automated data collection via email, SMS, Telegram, and Facebook Messenger. We are also actively working to support Slack, Voice (via automated telephony calling and speech-to-text collection of data), and WhatsApp. Itheum is also building an iOS and Android mobile app that will be able to integrate with your mobile platform's existing secure data vaults (Apple Health and Google Fit) and automate the collection of valuable health and wellness data.

The platform will also support various 3rd party, on-chain, and off-chain Data Adaptors (Fitbit, Garmin, Open APIs, The Graph, ChainLink, etc).

Automated Data Collection Scheduler

Apps built on Itheum will actively and passively collect data from users and seamlessly bridge this data into web3. Highly structured, outcome-oriented data can be collected by using Smart Data Types, and these data types usually need to be collected on a pre-defined, structured schedule for them to provide the most value.

Using the built-in Automated Data Collection Scheduler, you can schedule data collection based on granular conditions and timetables. You can pick the day and time data needs to be collected, skip certain days, or collect weekly, fortnightly, or monthly. You can also trigger the collection of other specific data based on the result of a previous data point, and use other customized logic for scheduled data collection.

For example, in the "Red Heart Challenge" program, blood pressure from your left arm is collected daily, but on the weekends blood pressure data is collected from your right arm. Using this schedule, we can derive analytics on the "variance of left vs right arm collection" and "weekday vs weekend collection". The Automated Data Collection Scheduler makes this kind of rich data collection possible and increases the value of the final data.
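
The "Red Heart Challenge" schedule can be expressed as a small rule. This is a sketch only: the function names are hypothetical, and it assumes the weekend right-arm reading replaces the left-arm reading rather than being collected in addition to it.

```python
from datetime import date, timedelta


def arm_for(day: date) -> str:
    """Left arm on weekdays, right arm on Saturday and Sunday."""
    # date.weekday() returns 0 for Monday through 6 for Sunday.
    return "right" if day.weekday() >= 5 else "left"


def build_schedule(start: date, days: int) -> list[tuple[date, str]]:
    """One blood pressure collection per day for `days` days, starting at `start`."""
    schedule = []
    for offset in range(days):
        day = start + timedelta(days=offset)
        schedule.append((day, arm_for(day)))
    return schedule
```

Calling `build_schedule(date(2022, 1, 3), 7)` starts on a Monday and yields five left-arm collections followed by two right-arm collections, which gives the analytics layer both comparisons mentioned above.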

User Portal

Every user who joins and participates in Itheum programs (e.g. Red Heart Challenge, OKPulse) will get access to their very own user portal. The user portal provides them with full visibility of the data that has been collected about them and with the tools to edit or delete that data at any time. The user portal enables full visibility, management, and control of the user's interactions with Itheum Apps.

Along with providing users with control over their app subscriptions and data, the portal can also be used by users to manage their account preferences, watch education modules (see below), participate in rewards programs (swap engagement reward points for gift cards), and join new programs.

The user portal will also be the central hub for "data and privacy control and regulation", where the user can opt into having their data stored in specific regions or even completely exit from the Itheum ecosystem and request a full delete of their data assets and history (e.g. "forget-me"). In the future, when Data Coalitions provide services like "delegated data usage policy approver", the user portal can be used by the user to manage this delegation interface.

Powerful Analytics and Insights Engine

As described above in the Smart Data Types section, each Data Type will have built-in composable analytics modules. These modules are built by developers and data analysts and will overlay on top of all the collected data to provide unparalleled insights. As these analytics modules are composable, you can piece them together to generate powerful analytics and gain insights into the data collected via your apps.

The insights engine can also make use of some core analytics modules built by the core Itheum team. An example of such a module is the AI Sentiment Analysis layer, which can be applied to any user-entered input collected by your apps. In the OKPulse App, the AI Sentiment Analysis module is used to analyze the written sentiment of an Employee as they interact with the app. The output from this layer is then used to detect levels of stress and anxiety among employees as they respond to certain data collection metrics.

Management Portal

Users who build and run Itheum Apps will get access to a management portal. This is akin to an admin portal that provides full management features over the apps they are running and the users who are enrolled in those apps. This portal will give an admin user the tools to manage their Itheum apps (update app configurations and settings, add new apps, change engagement channels for data collection, upload new education modules, etc) and also manage their users (manually onboard new users, change password and access control for users, see user progress, add users to new programs, etc).

The portal will also provide comprehensive data reporting features that allow the user to view the data analytics produced by the Powerful Analytics and Insights Engine. The management portal is the "control panel" for everyone who runs Itheum apps.

Built-In Video Education Studio

Itheum aims to generate very high-quality personal data, but this can be a challenge as the problems of bad data quality are very common in the data tech industry, especially when the data is generated by an end-user and not a machine (like a sensor). For example, in the health and wellness industry, the problems associated with bad "patient-generated data" make it very hard for clinical staff to make automated diagnoses or dispositions, as there is a high possibility that the data might have been incorrectly generated or entered by the end-user (patient). These problems are not unique to the health and wellness industry and are common across any sector where end-users are responsible for generating and submitting their data.

To improve the quality of data that is produced by Itheum apps, we have implemented an "education layer" that's built into the Itheum platform. App developers can create or reuse education videos that guide the end-user on the proper methods of data generation and collection. In the "Red Heart Challenge" app, for example, all the users of the app are presented with educational videos on the correct method to take blood pressure from their arms whilst at home and how they should submit it.

The Itheum Management Portal has a video studio that allows you to record videos or reuse videos from public channels like YouTube. Users are also rewarded for watching these videos, with engagement points being handed to users as they complete video education modules. With users earning rewards for being educated on data collection, their engagement rates increase and Itheum generates higher quality data. In the future, Itheum will also allow 3rd party education content providers to generate education videos for use within the Itheum App ecosystem.

Dispense Rewards for Compliance

The data collection apps built on Itheum aim to have built-in mechanisms to boost user engagement and compliance. As higher user engagement leads to higher quantities of data, these incentive mechanisms are invaluable to the system. Users who join the Data DEX to connect their Itheum Apps and then bridge and trade their data on the blockchain can earn $ITHEUM tokens in return for access to their data. But this requires users to have web3 experience and it can be a limiting factor initially where Itheum's apps may have many users who are yet to enter the web3 technology ecosystem (wallet usage, token purchase on DEX/CEX, etc).

To enable the widest adoption of Itheum in web2 and web3, the Itheum platform also introduces a web2 rewards system - where users earn "traditional rewards points" as they engage with the apps they join. Users can earn points for completing data collection tasks on time, watching education videos, logging into the platform to check progress, etc. This reward system is very similar to the mainstream loyalty points systems that users will be familiar with, where you can earn points for credit card usage, frequent travel, and shopping at grocery stores. The Itheum app platform's rewards points can then be redeemed in the reward store for real-world gift cards or redeemed for $ITHEUM tokens in the Data DEX.

Full White label Support

Apps developed on the Data CAT will be built by commercial enterprises and also academic and research institutes. We have seen this already with OKPulse, which was built by a corporate wellness institute, and Red Heart Challenge which was co-designed and built by an academic institute. These entities will usually have a target use case they'd like to engage with and often require the entire app experience and data collection communication to be "white-labeled".

White Labelling includes using your own domain names, logos and branding, email and mobile numbers, support ticketing systems, etc. By white labeling an Itheum App, app developers can boost adoption with their users, as the entire user experience will be branded as something those users can trust. To meet this requirement we have built in full white label support for Itheum Apps, and we feel that this is a huge selling point for mass adoption in both the commercial and non-commercial sectors.

App & Smart Data Type Marketplace

Anyone can participate in the Data CAT ecosystem and build Smart Data Types and Apps. For Apps built for use within the not-for-profit and academic sectors, Itheum will offer the toolkit free of charge. For commercial use-cases, Itheum will charge a nominal fee: the toolkit will be offered as a Platform-as-a-Service, priced so that it remains accessible to small-to-medium businesses and they can participate in the personal data collection industry.

As detailed above, all Smart Data Types and Apps are "modular"; which means that you can clone them and build on top of them. This makes Itheum highly composable and extensible and will provide huge opportunities for innovative data collection and analytics solutions to come into existence.

All new Smart Data Types and Apps built on Itheum will be available in an Itheum Marketplace. Most of them will be available free of charge, but we will also provide a mechanism for Smart Data Type and App developers to build solutions that can be sold or licensed. This provides a dynamic marketplace for data collection and analytics innovation and an opportunity for 3rd party developers, data analysts, data scientists, and product owners to earn some income.


Key Terms of Reference

  • Data Creator: The type of user who generates the raw data (i.e. the data is an extension to them and would not exist if they did not generate it). This is usually the user who uses an app built using Itheum’s Data CAT.
  • Data Seller: A Data Creator who uses the DEX to sell their data either as a direct Data Pack sale or Data NFT.
  • Data Owner: The type of user who owns a piece of data that’s available in the decentralized data DEX/marketplace. The Owner does not have to be the Data Creator; it can be someone who purchased a Data Pack or a Data NFT and can prove access ownership of that asset on the blockchain. So for each Data Pack, for example, there will be 1 Data Creator and potentially multiple, unlimited (n∞) Data Owners (the users who procured access to the original Data Pack). And for each Data NFT, for example, there will be 1 Data Creator and a single (n1) or multiple, limited (nX) Data Owners, as Data NFTs can have a limited supply of 1 to x.
  • Buyer (Access Requester): The type of user who logs into the Data DEX and buys access to a Data Pack or Data NFT. They then become a Data Owner. They are also referred to as an Access Requester as they are technically not "buying data to fully own it", instead they are only "buying access rights to use a copy of the data".
  • Data Pack: The simplest dataset that can be sold on the Data DEX. It’s a file in JSON format that can be put on sale by the Data Creator and then gets advertised for sale in the marketplace. The sale happens directly between Creator and Buyer with no intermediaries.
  • Data Order: When a buyer buys a Data Pack, a Data Order is created. This holds some metadata about the transaction. This information is kept off-chain (to conserve on-chain resources and keep costs low) but can be verified to be accurate on-chain.
  • Data NFT: The alternate way to sell data is by wrapping it in an NFT and then trading it as a regular NFT product in the NFT marketplace.
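
The Data Order definition above says that off-chain metadata "can be verified to be accurate on-chain" but does not say how. A common pattern for this, assumed here purely for illustration, is to store only a hash of the metadata on-chain and recompute it from the off-chain copy. The function names, the field names in the usage note, and the choice of SHA-256 over canonical JSON are assumptions, not details from the Itheum protocol.

```python
import hashlib
import json


def order_digest(metadata: dict) -> str:
    """SHA-256 over a canonical JSON encoding, so that key order does not matter."""
    canonical = json.dumps(metadata, sort_keys=True, separators=(",", ":"))
    return hashlib.sha256(canonical.encode("utf-8")).hexdigest()


def verify_order(metadata: dict, on_chain_digest: str) -> bool:
    """True if the off-chain metadata still matches the digest recorded on-chain."""
    return order_digest(metadata) == on_chain_digest
```

For instance, a hypothetical order such as `{"data_pack_id": "dp-001", "buyer": "0xabc", "price_itheum": 25}` would be hashed when the order is created. Any later change to the off-chain record, such as a different price, produces a different digest and fails `verify_order`.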
