<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: YUE GUO</title>
    <description>The latest articles on DEV Community by YUE GUO (@guoyuecn).</description>
    <link>https://dev.to/guoyuecn</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F879749%2F0bce2c84-de06-45b3-bafd-1d977ac24bcb.jpg</url>
      <title>DEV Community: YUE GUO</title>
      <link>https://dev.to/guoyuecn</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/guoyuecn"/>
    <language>en</language>
    <item>
      <title>NebulaGraph Cloud on AWS: Auto Scale, Start Free</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Thu, 05 Sep 2024 02:22:44 +0000</pubDate>
      <link>https://dev.to/aws/nebulagraph-cloud-on-awsauto-scale-start-free-1ojm</link>
      <guid>https://dev.to/aws/nebulagraph-cloud-on-awsauto-scale-start-free-1ojm</guid>
      <description>&lt;h2&gt;
  
  
  &lt;em&gt;Summary&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;NebulaGraph is an open-source, distributed, easily scalable native graph database capable of hosting very large datasets with hundreds of billions of vertices and trillions of edges while serving queries with millisecond latency.&lt;br&gt;
Thanks to their unique data model and efficient query performance, graph databases play an important role in many business scenarios. Here are several typical applications:&lt;br&gt;
● &lt;strong&gt;Knowledge graphs&lt;/strong&gt;: Businesses and organizations build domain knowledge graphs from scattered data silos to support intelligent Q&amp;amp;A and semantic search.&lt;br&gt;
● &lt;strong&gt;Social network analysis&lt;/strong&gt;: Social platforms analyze relationships, interests, and interactions between users to offer personalized recommendations and targeted advertisements.&lt;br&gt;
● &lt;strong&gt;Financial risk control&lt;/strong&gt;: Financial institutions monitor anomalous transaction activity, detect potential fraud, and assess credit risk.&lt;br&gt;
● &lt;strong&gt;Recommendation systems&lt;/strong&gt;: E-commerce and media platforms provide personalized product or content recommendations based on users' browsing history, purchase records, and other signals.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;NebulaGraph Cloud is a &lt;strong&gt;fully managed cloud service&lt;/strong&gt; for NebulaGraph, and this article details how it reduces costs and improves efficiency in combination with AWS offerings. (For further instructions, please refer to the AWS Marketplace product page:&lt;br&gt;
&lt;a href="https://aws.amazon.com/marketplace/pp/prodview-wctboqxrwqkjy?sr=0-1&amp;amp;ref=beagle&amp;amp;applicationId=AWSMPContessa" rel="noopener noreferrer"&gt;https://aws.amazon.com/marketplace/pp/prodview-wctboqxrwqkjy?sr=0-1&amp;amp;ref=beagle&amp;amp;applicationId=AWSMPContessa&lt;/a&gt;)&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Platform architecture and components&lt;/em&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Platform architecture&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa0p50wynqa3na1d506b3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa0p50wynqa3na1d506b3.png" alt=" " width="800" height="692"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;NebulaGraph Cloud is designed and built on K8S for the following reasons:&lt;br&gt;
● K8S's Operator pattern automates all internal cluster activities, significantly reducing development costs.&lt;br&gt;
● K8S offers a range of abstractions that streamline the management of compute, storage, and network resources, helping users avoid vendor lock-in and remain completely cloud-agnostic.&lt;br&gt;
● K8S's inherent features, such as scaling up and down, the scheduling framework, resource monitoring, and service discovery, simplify many management operations.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Service component&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;NebulaGraph Cloud is composed of two main parts:&lt;br&gt;
Control Plane&lt;br&gt;
The control plane provides a user console for organizational management, user rights control, database management, network security settings, monitoring, alerting, and various other features that help users manage their NebulaGraph clusters.&lt;br&gt;
Data Plane&lt;br&gt;
The data plane receives commands from the control plane and handles resource scheduling, provisioning, versioning, horizontal scaling, specification adjustments, backup and recovery, observability, metering, and other functions.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Cost-optimized design&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Illustrated with a sample database configuration:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: apps.nebula-cloud.io/v1alpha1
kind: Database
metadata:
  name: db123456
  namespace: db123456
spec:
  provider: aws
  region: "us-east-2"
  k8sRef:
     name: "k8s-ml65nmjg"
     namespace: "k8s-ml65nmjg"
  tier: standard
  graphInstanceType: NG.C4.M32.D0
  graphNodes: 1
  storageInstanceType: NG.C4.M32.D50
  storageNodes: 1
  version: v3.8.2c
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The database instance is defined using a Custom Resource Definition (CRD), and a database controller manages its operations within the control plane. Dependent compute resources are defined using the NodePool CRD, an abstraction over each cloud vendor's compute resources that is used to manage NodeGroups in AWS's EKS. As previously mentioned, NebulaGraph has a compute-storage separated architecture, so a database instance corresponds to two NodePool objects.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;# Graph resource pool
apiVersion: apps.nebula-cloud.io/v1alpha1
kind: NodePool
metadata:
  name: gnp-7d0b156e
  namespace: db123456
spec:
  databaseRef:
    name: db123456
    namespace: db123456
  instanceCount: 1
  instanceType: x
  k8sRef:
    name: k8s-ml65nmjg
    namespace: k8s-ml65nmjg
  labels:
    platform.nebula-cloud.io/database-name: db123456
    platform.nebula-cloud.io/graph-pool: db123456-f24a
  provider: aws
  region: us-east-2
  zoneIndex: 0
# Storage resource pool
apiVersion: apps.nebula-cloud.io/v1alpha1
kind: NodePool
metadata:
  name: snp-59e9cc63
  namespace: db123456
spec:
  databaseRef:
    name: db123456
    namespace: db123456
  instanceCount: 1
  instanceType: x
  k8sRef:
    name: k8s-ml65nmjg
    namespace: k8s-ml65nmjg
  labels:
    platform.nebula-cloud.io/database-name: db123456
    platform.nebula-cloud.io/storage-pool: db123456-un6g
  provider: aws
  region: us-east-2
  zoneIndex: 0
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  &lt;em&gt;Specification adjustments&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Choosing the appropriate instance size is more challenging than expected for both users and service providers. Users frequently struggle to determine the optimal instance size for their needs. For example, as a user's data grows from 100GB to 1TB, they may be uncertain how many CPU cores or how much memory is needed at each stage; typically the user and a solutions architect (SA) collaborate to ascertain these requirements.&lt;br&gt;
For our fully managed cloud service, we provide an instance specification tuning function that supports tuning Query nodes or Storage nodes individually, meeting users' needs for cost control and performance. Users can determine whether a single-node processing bottleneck has been reached from the historical curves on the monitoring panel, and the cloud platform will also issue an internal alert.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fysozo4i3w4jf4gbjdlgv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fysozo4i3w4jf4gbjdlgv.png" alt=" " width="548" height="198"&gt;&lt;/a&gt;&lt;/p&gt;
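In terms of the sample Database object shown earlier, a vertical specification adjustment amounts to changing the instance-type fields on the spec. The fragment below is a hypothetical illustration only; the target type name is invented and the actual adjustment workflow is internal to the platform:

```yaml
# Hypothetical adjustment: grow the Query (Graph) tier one size,
# leaving the Storage tier untouched.
spec:
  graphInstanceType: NG.C8.M64.D0   # was NG.C4.M32.D0; type name is illustrative
  graphNodes: 1
```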
&lt;h2&gt;
  
  
  &lt;em&gt;Horizontal expansion and contraction&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Horizontal scaling supports tuning the Query nodes individually. The Query node can be regarded as a stateless service and can be scaled out or in to manage fluctuations in query traffic. Similarly, the Storage node, responsible for storing graph data and providing read and write capabilities, can also be scaled, with several data partitions distributed across each Storage node. Each expansion or contraction necessitates data migration and partition rebalancing, and therefore requires careful operation.&lt;br&gt;
Before proceeding with node expansion, the corresponding computing resources must be prepared. The database controller sends the current desired number of nodes to the infra-controller to trigger expansion of the EKS NodeGroup. We do not enable Cluster Autoscaler (CA) or Karpenter here for the following reasons:&lt;br&gt;
● Scale-up/down decisions depend on specific cluster conditions and resource requests; if those conditions are not met, the intended scaling operation may not be performed.&lt;br&gt;
● Cross-availability-zone scenarios do not guarantee scaling within a given availability zone.&lt;br&gt;
● The pending-Pod-based model is not tightly integrated with our business systems.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fji5x8bcorpvd3ocaa4cc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fji5x8bcorpvd3ocaa4cc.png" alt=" " width="673" height="223"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Automatic elastic expansion and contraction&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Supporting different workloads through automatic scaling is a critical goal for us. Since compute and storage are separated, we can increase or decrease CPU and memory resources based on the utilization of each service node.&lt;br&gt;
The elastic scaling service architecture is as follows:&lt;br&gt;
Every 10 minutes, the elastic scaling service calculates the amount of resources that should be allocated to a database instance based on historical and current values of the monitored data, and ultimately determines whether compute resources should be scaled up or down. It can drive both horizontal scaling and vertical scaling (adjusting instance specifications) to ensure capacity is in place before business peaks arrive. We will continue to improve on this foundation, incorporating more metrics and proactive prediction strategies to shorten scaling time.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwhhhvonrrmokedrfikf0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwhhhvonrrmokedrfikf0.png" alt=" " width="779" height="314"&gt;&lt;/a&gt;&lt;/p&gt;
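The 10-minute decision loop described above can be sketched roughly as follows. This is an illustrative sketch only: the thresholds, window size, and function names are invented for the example, and the platform's real policy (which also weighs memory, QPS, and prediction strategies) is internal.

```python
from statistics import mean

# Illustrative thresholds, not the platform's real policy.
SCALE_OUT_CPU = 0.70   # scale out above 70% average CPU
SCALE_IN_CPU = 0.30    # scale in below 30% average CPU

def desired_nodes(cpu_samples: list[float], current: int,
                  min_nodes: int = 1, max_nodes: int = 16) -> int:
    """Decide the target node count from recent CPU utilization samples
    (e.g. one sample per 10-minute evaluation window)."""
    recent = mean(cpu_samples[-6:])  # look back roughly one hour
    if recent > SCALE_OUT_CPU and current < max_nodes:
        return current + 1
    if recent < SCALE_IN_CPU and current > min_nodes:
        return current - 1
    return current
```

A decision like this can drive either horizontal scaling (changing the node count) or, with an analogous rule over per-node utilization, vertical scaling of the instance specification.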
&lt;h2&gt;
  
  
  &lt;em&gt;Design for Efficiency Improvement&lt;/em&gt;
&lt;/h2&gt;
&lt;h3&gt;
  
  
  &lt;em&gt;Network security access&lt;/em&gt;
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6f6b15y8tgpzduvumvgk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F6f6b15y8tgpzduvumvgk.png" alt=" " width="800" height="250"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Currently, we offer users two types of network connection: network whitelisting and PrivateLink private connections. To ensure network security and simplify configuration, we introduced a proxy service, ngproxy, between the database instances and the Network Load Balancer (NLB), which takes full advantage of the functionality the NLB provides.&lt;br&gt;
The NLB can prepend a Proxy Protocol v2 (PPv2) header to each connection. ngproxy parses these headers to verify that the user's endpoint ID matches the ID configured in the console, and any mismatched requests are rejected. The same principle applies to the whitelisting of public network access: a request is allowed through only if its source address matches an address in the whitelist.&lt;/em&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Backup and recovery&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;Data backup is the primary defense against data loss, whether due to hardware failure, software error, data corruption, or human error.&lt;br&gt;
We provide users with both manual and scheduled backups and upload the backup data to Amazon S3 object storage. We do not use a backup-and-recovery solution based on cloud storage snapshots, because SST-based backup and recovery is faster and does not depend on the cloud vendor's underlying services. Note that DDL and DML statements are blocked during a backup, so it is recommended to perform backups during off-peak periods.&lt;br&gt;
Backup data directory structure:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;backup_root_url/
- BACKUP_2024_08_20_16_31_43
├── BACKUP_2024_08_20_16_31_43.meta
├── data
│   └── 10.0.0.12:9779
│       └── data0
│           └── 5
│               ├── data
│               │   ├── 000009.sst
│               │   ├── 000011.sst
│               │   ├── 000013.sst
│               │   ├── CURRENT
│               │   ├── MANIFEST-000004
│               │   └── OPTIONS-000007
│               └── wal
│                   ├── 1
│                   │   ├── 0000000000000000001.wal
│                   │   └── 0000000000000000005.wal
.....
│                   ├── 30
│                   │   ├── 0000000000000000001.wal
│                   │   ├── 0000000000000000004.wal
│                   │   └── 0000000000000000005.wal
└── meta
    ├── __disk_parts__.sst
    ├── __edges__.sst
    ├── __id__.sst
    ├── __indexes__.sst
    ├── __index__.sst
    ├── __last_update_time__.sst
    ├── __local_id__.sst
    ├── __parts__.sst
    ├── __roles__.sst
    ├── __spaces__.sst
    └── __tags__.sst
- BACKUP_2024_08_20_19_02_35
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
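Each backup set's directory name encodes its snapshot time, so selecting the most recent set for a restore can be sketched as follows (an illustrative helper, not an official tool):

```python
from datetime import datetime

def latest_backup(names: list[str]) -> str:
    """Pick the most recent backup set from names like BACKUP_2024_08_20_16_31_43."""
    def ts(name: str) -> datetime:
        return datetime.strptime(name, "BACKUP_%Y_%m_%d_%H_%M_%S")
    return max(names, key=ts)
```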



&lt;p&gt;&lt;em&gt;Recovery restores data to a new instance, taking full advantage of the elastic resources of the cloud: specify a backup set taken at a point in time to quickly bring up a new database instance, and once the user has verified that the new instance works properly, the old instance can be released.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;em&gt;High availability&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;For those who demand the highest level of service reliability, we offer a cross-availability-zone disaster recovery solution. In NebulaGraph, a Zone is a logical grouping of Storage nodes used to achieve resource isolation. At the same time, the platform service directs Query nodes to access replica data within their designated Zone, reducing the network latency and traffic costs incurred by cross-availability-zone communication.&lt;br&gt;
One of the main challenges in managing stateful services across availability zones is the zone affinity of storage volumes. To address this, we place a NodeGroup in each availability zone. This prevents a newly expanded node from being placed in the wrong availability zone, which would leave its Pod unschedulable.&lt;br&gt;
Drawing on past user experience, we also support expanding capacity in a single availability zone. HPA-like solutions from the cloud-native community only control the total replica count after expansion; they cannot scale a specified zone. Scaling according to the business traffic carried by each zone can effectively improve quality of service: if one zone observes a sudden surge in QPS, the data plane's elastic scaling service allows the Query nodes in that zone to be scaled out.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6t2s7blb8qvnik3wvuz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa6t2s7blb8qvnik3wvuz.png" alt=" " width="722" height="323"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;em&gt;Practical exercises on the cloud&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;&lt;strong&gt;Welcome to&lt;/strong&gt; &lt;a href="https://cloud.nebula-graph.io" rel="noopener noreferrer"&gt;https://cloud.nebula-graph.io&lt;/a&gt;. &lt;strong&gt;After initiating a subscription and creating a database&lt;/strong&gt;, the next steps demonstrate several ways to access the database:&lt;/em&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;em&gt;Visualization tools&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Select an instance from the Database list, go to Data-&amp;gt;Graph graph space management, and click Explorer Graph to enter the visualization application.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F082ky5wdcjotjs53em88.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F082ky5wdcjotjs53em88.png" alt=" " width="800" height="230"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Clicking Console in the upper right corner opens the console, where you can execute nGQL query statements.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnkdr9qhggow618fr0pw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmnkdr9qhggow618fr0pw.png" alt=" " width="800" height="417"&gt;&lt;/a&gt;&lt;/p&gt;
&lt;h4&gt;
  
  
  &lt;em&gt;Network whitelisting&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;On the Overview page of the database instance, click Connect and select the Public method.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frst4icvppwcwycbok7ou.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frst4icvppwcwycbok7ou.png" alt=" " width="800" height="620"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;When network whitelisting is not configured, the database instance cannot be accessed directly from the public network.&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;$ ./nebula-console -addr nebula-graph-ncqqo19eh0aus73e4un6g.aws.cloud.nebula-graph.io -port 9669 -u dbaas-test@vesoft.com -p $PASSWORD -enable_ssl
Welcome!
(dbaas-test@vesoft.com@nebula) [(none)]&amp;gt; show spaces;
+------------------------+
| Name                   |
+------------------------+
| "genshin"              |
| "hanxiaotest"          |
| "movie_recommendation" |
| "nba"                  |
+------------------------+
Got 4 rows (time spent 602µs/222.883326ms)
Mon, 02 Sep 2024 16:52:39 CST
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h4&gt;
  
  
  &lt;em&gt;PrivateLink private network connection&lt;/em&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;em&gt;Follow the configuration steps for &lt;strong&gt;Create Private Link Endpoint&lt;/strong&gt; to determine the VPC ID and Subnet ID where the service resides.&lt;br&gt;
Run the command to create the Endpoint:&lt;/em&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws ec2 create-vpc-endpoint --vpc-id &amp;lt;YOUR-VPC-ID&amp;gt;  --region us-east-2 --service-name com.amazonaws.vpce.us-east-2.vpce-svc-04f25889c855f891b --vpc-endpoint-type Interface --subnet-ids  &amp;lt;YOUR-SUBNET-IDs&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;em&gt;Log in to the AWS console to view the Endpoint status.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcijvhonkqr1pvakrarhl.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcijvhonkqr1pvakrarhl.png" alt=" " width="800" height="246"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Test access within the business VPC&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgwqh5je8v5m5gqwpv10.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcgwqh5je8v5m5gqwpv10.png" alt=" " width="800" height="169"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Next, you can maintain your data in the cloud with the official ecosystem tools, according to your business needs.&lt;/em&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;em&gt;Summary and outlook&lt;/em&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;em&gt;This article described the architectural details of building a fully managed cloud service for NebulaGraph on AWS based on cloud-native concepts, focusing on how AWS offerings can yield substantial benefits to users and provide a new solution that combines cost and elasticity advantages for running NebulaGraph in the cloud.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;While ensuring service quality, we will persist in exploring ways to &lt;strong&gt;reduce user costs and increase ease of use&lt;/strong&gt;. Our goal is for users to pay only for the resources they use, &lt;strong&gt;applying cost awareness in every detail&lt;/strong&gt;.&lt;/em&gt;&lt;br&gt;
&lt;em&gt;Beta trials are currently underway, with a maximum spend of $1 per month. You can subscribe via AWS Marketplace (&lt;a href="https://aws.amazon.com/marketplace/pp/prodview-wctboqxrwqkjy?sr=0-1&amp;amp;ref=beagle&amp;amp;applicationId=AWSMPContessa" rel="noopener noreferrer"&gt;https://aws.amazon.com/marketplace/pp/prodview-wctboqxrwqkjy?sr=0-1&amp;amp;ref=beagle&amp;amp;applicationId=AWSMPContessa&lt;/a&gt;) or &lt;strong&gt;visit&lt;/strong&gt; &lt;a href="https://cloud.nebula-graph.io" rel="noopener noreferrer"&gt;https://cloud.nebula-graph.io&lt;/a&gt; to sign up for an account.&lt;br&gt;
If you are interested in NebulaGraph Cloud, please join our Slack community to unlock more DBaaS-related content: &lt;a href="https://join.slack.com/t/nebulagraph/shared_invite/zt-2k1eawak2-00D7XxOIZr8mDt0R4JfsSw" rel="noopener noreferrer"&gt;https://join.slack.com/t/nebulagraph/shared_invite/zt-2k1eawak2-00D7XxOIZr8mDt0R4JfsSw&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Practical Tips for Choosing the Right AWS EC2 for your Workload</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Fri, 12 Jan 2024 02:21:54 +0000</pubDate>
      <link>https://dev.to/aws/practical-tips-for-choosing-the-right-aws-ec2-for-your-workload-2id7</link>
      <guid>https://dev.to/aws/practical-tips-for-choosing-the-right-aws-ec2-for-your-workload-2id7</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5qadbotgfij77j682yb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ft5qadbotgfij77j682yb.png" alt=" " width="800" height="457"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Background
&lt;/h2&gt;

&lt;p&gt;AWS EC2, the Elastic Compute Cloud service from Amazon Web Services, offers developers user-friendly and flexible virtual machines. As one of the most established services within AWS, alongside S3, EC2 has a rich history dating back to its inception in 2006. Over nearly 17 years, it has continuously evolved, underlining its significance and reliability in the cloud computing space.&lt;/p&gt;

&lt;p&gt;Many people new to AWS EC2 might have similar feelings:&lt;/p&gt;

&lt;blockquote&gt;
&lt;ul&gt;
&lt;li&gt;There are too many types of AWS EC2 (hundreds)! Which one should I choose to meet my business needs without exceeding the budget?&lt;/li&gt;
&lt;li&gt;If the CPU and memory configurations of two EC2 instances are the same, does that mean their performance is also the same?&lt;/li&gt;
&lt;li&gt;What is the most cost-effective EC2 payment mode?&lt;/li&gt;
&lt;/ul&gt;
&lt;/blockquote&gt;

&lt;p&gt;Reflecting back on the initial launch of EC2, there were only &lt;a href="https://en.wikipedia.org/wiki/Amazon_Elastic_Compute_Cloud#:~:text=Amazon%20EC2%20was%20developed%20mostly,along%20with%20Willem%20van%20Biljon." rel="noopener noreferrer"&gt;two types of instance&lt;/a&gt; available. Fast forward to today, and the landscape has dramatically expanded to an impressive &lt;a href="https://github.com/aws/aws-sdk-go-v2/blob/main/service/ec2/types/enums.go#L3707-L4487" rel="noopener noreferrer"&gt;781 different types&lt;/a&gt;. This vast selection of EC2 options presents developers with a wide array of choices, potentially leading to a challenging decision-making process.&lt;/p&gt;

&lt;p&gt;This article will briefly introduce some tips for selecting EC2 instances to help readers choose the right EC2 type more smoothly.&lt;/p&gt;

&lt;h2&gt;
  
  
  Model Classification and Selection
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Meet the EC2 family
&lt;/h3&gt;

&lt;p&gt;Although AWS has hundreds of EC2 types, there are only a few &lt;a href="https://aws.amazon.com/ec2/instance-types/?nc1=h_ls" rel="noopener noreferrer"&gt;major categories&lt;/a&gt;, listed as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dji1tfa8to3mi895w3t.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1dji1tfa8to3mi895w3t.png" alt=" " width="800" height="590"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;General Purpose, M and T series&lt;/strong&gt;: Provide a balance of CPU, memory, and network resources, sufficient for most scenarios;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Compute Optimized, C series&lt;/strong&gt;: Suitable for compute-intensive services;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Memory Optimized, R and X series&lt;/strong&gt;: Designed to provide high performance for workloads processing large data sets;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Accelerated Computing&lt;/strong&gt;: Accelerate the compute instances and use hardware accelerators or coprocessors to execute functions such as floating-point calculations, graphics processing, or data pattern matching, which are more efficient than software running on CPUs;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Storage Optimized&lt;/strong&gt;: Designed for workloads that require high-speed, continuous read and write access to very large data sets on local storage;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;HPC Optimized&lt;/strong&gt;, HPC series: A new category by AWS mainly suitable for applications that require high-performance processing, such as large complex simulations and deep learning workloads;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Typically, &lt;strong&gt;each specific EC2 type belongs to a Family with a corresponding numerical sequence&lt;/strong&gt;. For example, for the General Purpose type M series:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;M7g / M7i / M7i-flex / M7a&lt;/li&gt;
&lt;li&gt;M6g / M6i / M6in / M6a&lt;/li&gt;
&lt;li&gt;M5 / M5n / M5zn / M5a&lt;/li&gt;
&lt;li&gt;M4&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The numerical sequence reveals that M7 represents the latest generation, whereas M4 is comparatively older. Generally, a higher number indicates a more recent model and CPU type, and often, the pricing is more favorable due to the natural depreciation of hardware.&lt;/p&gt;
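The naming convention can be made concrete with a small parser. This is a sketch for illustration: it handles names of the common family.size shape (series letter, generation digits, optional suffix such as 'g' for Graviton or 'a' for AMD), not every EC2 variant.

```python
def parse_instance_type(name: str) -> dict:
    """Split an EC2 instance type like 'm7g.xlarge' into its naming parts."""
    family, size = name.split(".", 1)
    series = family[0]        # e.g. 'm' (general purpose), 'c' (compute), 'r' (memory)
    generation, attributes = "", ""
    for ch in family[1:]:
        if ch.isdigit() and not attributes:
            generation += ch          # generation number, e.g. 7
        else:
            attributes += ch          # processor/feature suffix, e.g. 'g', 'a', 'i-flex'
    return {"series": series, "generation": int(generation),
            "attributes": attributes, "size": size}
```

For example, `parse_instance_type("m7g.xlarge")` separates the M series, generation 7, the Graviton suffix, and the xlarge size.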

&lt;h3&gt;
  
  
  Key Parameters
&lt;/h3&gt;

&lt;p&gt;We can extract the following key parameters from the AWS EC2 model introduction.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr8h0ov50if7y1xd2cfv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fyr8h0ov50if7y1xd2cfv.png" alt=" " width="800" height="649"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Specific Model of EC2&lt;/strong&gt;: Generally named as &lt;code&gt;&amp;lt;family&amp;gt;.&amp;lt;size&amp;gt;&lt;/code&gt;, like &lt;code&gt;m7g.large&lt;/code&gt; / &lt;code&gt;m7g.xlarge&lt;/code&gt;. An EC2 instance type name is globally unique;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU and Memory Size&lt;/strong&gt;: The number of vCPUs and the amount of memory. Most EC2 models use a 1:4 vCPU-to-memory ratio: 1 vCPU is usually paired with 4 GiB of memory, 2 vCPUs with 8 GiB, and so on.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Instance Storage&lt;/strong&gt;: EC2 can generally mount different types of persistent storage disks, mainly:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;p&gt;&lt;strong&gt;EBS&lt;/strong&gt;: AWS's distributed block-storage service, which is usually the default choice for most EC2 models. &lt;strong&gt;Some models are EBS-only&lt;/strong&gt;, and an EBS volume is bound to a specific AZ. Although its read/write latency is higher than a local SSD's, it is acceptable in most scenarios. EBS also comes in &lt;a href="https://docs.aws.amazon.com/AWSEC2/latest/UserGuide/ebs-volume-types.html" rel="noopener noreferrer"&gt;different types&lt;/a&gt; based on parameters like &lt;strong&gt;IOPS and throughput&lt;/strong&gt;, such as:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;gp2/gp3&lt;/strong&gt;: General-purpose SSD volumes, with gp3 officially recommended for better cost-performance. The default is typically 3000 IOPS, and IOPS can be increased on demand without any downtime, though at additional cost;&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;io1/io2&lt;/strong&gt;: Stronger performance at a higher price, also supporting features like &lt;strong&gt;Multi-Attach&lt;/strong&gt; (other EBS types can usually be attached to only one EC2 instance at a time);&lt;/li&gt;
&lt;/ul&gt;


&lt;/li&gt;

&lt;li&gt;&lt;p&gt;&lt;strong&gt;Local Storage&lt;/strong&gt;: Some models support local storage in addition to EBS, at a higher price. These models generally have a &lt;code&gt;d&lt;/code&gt; in their name: for example, &lt;code&gt;m7g.large&lt;/code&gt; is EBS-only, while &lt;code&gt;m7gd.large&lt;/code&gt; comes with one 118 GiB local NVMe SSD. Some special models also offer larger local HDDs;&lt;/p&gt;&lt;/li&gt;

&lt;/ul&gt;

&lt;ol start="4"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;EBS Bandwidth&lt;/strong&gt;: For some newer and specifically EBS-optimized EC2 models, AWS equips them with dedicated EBS bandwidth. This means that in high data throughput scenarios, EBS-optimized models can always enjoy better throughput without competing for network bandwidth on the local machine;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Network Bandwidth&lt;/strong&gt;: The network bandwidth corresponding to the EC2 model;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;CPU Model&lt;/strong&gt;: In most scenarios, we can see CPUs from the following manufacturers:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;AWS's self-developed Graviton processor based on the ARM architecture (currently up to Graviton 3), such as the M7g series;&lt;/li&gt;
&lt;li&gt;Intel x86-64 architecture CPU;&lt;/li&gt;
&lt;li&gt;AMD x86-64 architecture CPU;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Generally, for similar configurations, &lt;strong&gt;Intel is the most expensive, followed by AMD, then Graviton, with cost-effectiveness ranking in the opposite order&lt;/strong&gt;. For general, performance-insensitive scenarios, users can consider ARM architecture models, which offer greater cost-effectiveness. &lt;/p&gt;

&lt;p&gt;AWS is one of the earliest cloud vendors to introduce ARM architecture into the server CPU field. After years of R&amp;amp;D, Graviton CPU has made significant progress and has a great competitive advantage in cost-performance. It is expected that more customers will use Graviton CPU models in the future.&lt;/p&gt;

&lt;ol start="7"&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Virtualization Technology&lt;/strong&gt;: Various EC2 models employ distinct virtualization technologies, resulting in differences in their technical parameters. For example, for newer EC2 models, &lt;a href="https://aws.amazon.com/ec2/nitro/" rel="noopener noreferrer"&gt;Nitro virtualization technology&lt;/a&gt; is generally applied. Nitro is AWS's latest virtualization technology, offloading many virtualization behaviors to hardware, making the software relatively lighter and virtualization performance stronger. From the user's perspective, identical configurations will yield enhanced performance due to reduced virtualization overhead.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Whether Suitable for Machine Learning Scenarios&lt;/strong&gt;: With the development of LLM technology, more and more vendors choose to train their models in the cloud. If you want to train models on AWS EC2, &lt;strong&gt;Accelerated Computing&lt;/strong&gt; instances would generally be your choice, such as:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;P series and G series models&lt;/strong&gt;: They use Nvidia's GPU chips. At the re:Invent 2023 conference, Nvidia and AWS started a deeper &lt;a href="https://nvidianews.nvidia.com/news/aws-nvidia-strategic-collaboration-for-generative-ai" rel="noopener noreferrer"&gt;strategic cooperation&lt;/a&gt;. AWS plans to use Nvidia's latest and most powerful GPUs to create a computing platform specifically for generative AI;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Trn and Inf series&lt;/strong&gt;: In addition to using Nvidia GPUs, AWS also develops its own chips for machine learning, such as the &lt;a href="https://aws.amazon.com/machine-learning/trainium/" rel="noopener noreferrer"&gt;Trainium chip&lt;/a&gt;  for training and the &lt;a href="https://aws.amazon.com/machine-learning/inferentia/" rel="noopener noreferrer"&gt;Inferentia chip&lt;/a&gt; for model inference. Trn series and Inf series EC2 models correspond to these two AWS-developed machine learning chips respectively;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
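&lt;p&gt;To recap the categories above, here is a toy lookup that maps a workload keyword to representative instance families. The family names are illustrative examples drawn from the series discussed in this post (plus a couple of common storage-optimized families added for completeness), not an exhaustive or authoritative mapping:&lt;/p&gt;

```python
# Toy mapping distilled from the instance categories discussed above.
# Family names are representative examples, not an exhaustive list.
WORKLOAD_TO_FAMILIES = {
    "general purpose": ["m7g", "m7i", "m7a"],
    "burstable / test environments": ["t3", "t4g"],
    "storage optimized": ["i4i", "d3"],          # assumed common examples
    "hpc": ["hpc7g"],
    "gpu training / inference": ["p5", "g5"],
    "aws ml silicon": ["trn1", "inf2"],
}

def suggest_families(workload):
    """Return candidate instance families for a workload keyword,
    falling back to general purpose for unknown inputs."""
    return WORKLOAD_TO_FAMILIES.get(workload.lower(), ["m7g"])

print(suggest_families("GPU training / inference"))  # ['p5', 'g5']
```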

&lt;h2&gt;
  
  
  Key Takeaways
&lt;/h2&gt;

&lt;p&gt;Building on the overview provided above (and there's much more to explore about EC2), we've compiled a few tips for users to consider when selecting EC2 instances.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Typically, for most EC2 models, a higher sequence number indicates a newer CPU model. This generally means better performance and often more cost-effective pricing: essentially, more bang for your buck.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Among the general-purpose models, the T series is relatively cheap and offers a &lt;strong&gt;Burstable CPU&lt;/strong&gt; feature: the instance accumulates CPU credits while running below baseline performance, and under high load it can spend those credits to run above baseline for a period of time at no extra cost. The trade-off is that the T series is not very performant, with generally low bandwidth and no EBS optimization, so it is better suited to test environments where performance is not critical;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Within the general-purpose series, if you're aiming for cost-efficiency, it's advisable to prioritize AWS ARM architecture models;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;AWS's official EC2 pricing pages are very difficult to read; it is recommended to use &lt;a href="https://ec2instances.info/" rel="noopener noreferrer"&gt;Vantage&lt;/a&gt; (also an open-source project) to check price information;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;For most cloud users, the cost of EC2 is generally their major expense. Here are a few ways to reduce this cost as much as possible:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Fully utilize the elasticity of the cloud&lt;/strong&gt;: &lt;br&gt;
Make your architecture as flexible as possible and use on-demand computing power. You can use AWS's &lt;a href="https://karpenter.sh/" rel="noopener noreferrer"&gt;Karpenter&lt;/a&gt; or &lt;a href="https://github.com/kubernetes/autoscaler/tree/master/cluster-autoscaler" rel="noopener noreferrer"&gt;Cluster Autoscaler&lt;/a&gt; to scale your EC2 fleet elastically;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Use Spot instances&lt;/strong&gt;: &lt;br&gt;
Spot instances can be 30% to 90% cheaper than On-Demand instances, but they can be preempted and cannot be relied on for long-term stable operation. AWS notifies you 2 minutes before preemption, then proceeds with it. With good underlying management, Spot instances are well suited to elastic, interruption-tolerant workloads. For example, the &lt;a href="https://skypilot.readthedocs.io/en/latest/" rel="noopener noreferrer"&gt;SkyPilot&lt;/a&gt; project uses Spot instances across clouds for machine learning training;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Optimize payment modes&lt;/strong&gt;: &lt;br&gt;
Beyond technical approaches, cost can also be reduced by purchasing Savings Plans. They offer lower unit costs than On-Demand pricing at the expense of flexibility, making them better suited to relatively stable business architectures.&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;
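&lt;p&gt;On a running Spot instance, the 2-minute interruption notice mentioned above surfaces through the instance metadata endpoint &lt;code&gt;http://169.254.169.254/latest/meta-data/spot/instance-action&lt;/code&gt;, which returns 404 until a notice is pending. A minimal sketch of checking for it, with the HTTP fetch injected as a callable so the logic can be exercised outside EC2:&lt;/p&gt;

```python
import json

def check_spot_interruption(fetch):
    """Return the pending interruption notice as a dict, or None.

    On a real Spot instance, `fetch` would GET the path from the metadata
    service at 169.254.169.254 and return the body, or None on HTTP 404.
    """
    body = fetch("/latest/meta-data/spot/instance-action")
    if body is None:  # 404: no interruption currently scheduled
        return None
    # e.g. {"action": "terminate", "time": "2024-09-05T02:20:00Z"}
    return json.loads(body)

# Simulated fetches standing in for the metadata service:
print(check_spot_interruption(lambda path: None))
# None
print(check_spot_interruption(
    lambda path: '{"action": "terminate", "time": "2024-09-05T02:20:00Z"}'))
# {'action': 'terminate', 'time': '2024-09-05T02:20:00Z'}
```

&lt;p&gt;A workload that polls this endpoint every few seconds has roughly two minutes to checkpoint and drain before preemption.&lt;/p&gt;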

&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;Efficient selection and utilization of EC2 should be tailored to the user's unique scenarios, requiring continuous and iterative optimization. In summary, leveraging the cloud's elasticity and understanding the key parameters of various EC2 models is essential for every AWS user.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>From 0 to Limitless, the 15-year odyssey of Amazon relational database product line</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Thu, 30 Nov 2023 11:31:15 +0000</pubDate>
      <link>https://dev.to/aws/from-0-to-limitless-the-15-year-odyssey-of-aws-relational-database-product-line-30n1</link>
      <guid>https://dev.to/aws/from-0-to-limitless-the-15-year-odyssey-of-aws-relational-database-product-line-30n1</guid>
<description>&lt;p&gt;From 0 to Limitless, the 15-year odyssey of the Amazon relational database product line&lt;br&gt;
It’s another annual Amazon re:Invent, and the most important release in the relational database product line is Amazon Aurora Limitless Database. In his keynote, AWS Senior Vice President Peter DeSantis also spent nearly half of his time going over the history of Amazon's relational databases.&lt;/p&gt;

&lt;h2&gt;
  
  
  2009 - RDS
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcn6ab97u30exlovqky6.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftcn6ab97u30exlovqky6.PNG" alt=" " width="800" height="368"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;0 to 1: lifting vanilla MySQL and PostgreSQL into the cloud.&lt;/p&gt;

&lt;h2&gt;
  
  
  2014 - Aurora
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zeoghg42ut3vug6q9yt.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8zeoghg42ut3vug6q9yt.PNG" alt=" " width="800" height="332"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffkf0wx6x908tnongls5v.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ffkf0wx6x908tnongls5v.PNG" alt=" " width="800" height="270"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Unveiling Aurora, which rebuilds the storage engine on a log-based architecture (codenamed Grover). Aurora greatly improves performance and availability while maintaining full MySQL and PostgreSQL compatibility.&lt;/p&gt;

&lt;h2&gt;
  
  
  2018 -  Aurora Serverless
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fejkmjsdgo8e8fegmycl8.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fejkmjsdgo8e8fegmycl8.PNG" alt=" " width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjcirv7zph7b76hlqti1k.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjcirv7zph7b76hlqti1k.PNG" alt=" " width="800" height="396"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdk5jvlnkl1blibp167e.PNG" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftdk5jvlnkl1blibp167e.PNG" alt=" " width="800" height="378"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Introducing Aurora Serverless, which leverages database-optimized virtualization technology (codenamed Caspian) to offer seamless scale-up and scale-down.&lt;/p&gt;

&lt;h2&gt;
  
  
  2023 - Aurora Limitless
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5i7oe8zf3itx872pqpo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu5i7oe8zf3itx872pqpo.png" alt=" " width="800" height="340"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcstx8cvjsxdux7rlbrc1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcstx8cvjsxdux7rlbrc1.png" alt=" " width="800" height="315"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The new Aurora Limitless: a scale-out distributed database built on ultra-low-latency clock synchronization.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faewm281twow1f26pgn2o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Faewm281twow1f26pgn2o.png" alt=" " width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Architecture-wise, Aurora Limitless resembles Google Spanner; both are distributed (NewSQL) databases. The hardest part of a distributed database is implementing high-performance distributed transactions, and Aurora Limitless adopts a solution similar to Spanner's TrueTime. I expect AWS to reveal details about its compatibility with native PostgreSQL and its performance benchmarks shortly.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftreezdu3aliwr4fa2p76.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftreezdu3aliwr4fa2p76.png" alt=" " width="800" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Speaking of database compatibility, Aurora Limitless launches with PostgreSQL support first rather than MySQL. I have two hypotheses: one is that the PostgreSQL codebase is easier to adapt to the Aurora Limitless architecture (Limitless requires a Router component that parses SQL, and the PostgreSQL server-layer code is easier to work with); the other is that PostgreSQL adoption has caught up with MySQL's.&lt;/p&gt;

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;&lt;/th&gt;
&lt;th&gt;Cloud&lt;/th&gt;
&lt;th&gt;Single-node Enhancement&lt;/th&gt;
&lt;th&gt;Scale up/down Elasticity&lt;/th&gt;
&lt;th&gt;Scale-out Infinity&lt;/th&gt;
&lt;th&gt;Technology Breakthrough 🚀&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;2009 - RDS&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;General virtualization&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2014 - Aurora&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Log-based architecture (Grover)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2018 - Aurora Serverless&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;&lt;/td&gt;
&lt;td&gt;Database-optimized virtualization (Caspian)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;2023 - Aurora Limitless&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;✅&lt;/td&gt;
&lt;td&gt;Distributed clock synchronization&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;After 15 years of iteration and 4 technological breakthroughs corresponding to 4 product generations, the AWS relational database line has reached its current form in Aurora Limitless. As for the database core, Aurora Limitless is quite complete. What remain are the database development workflow challenges:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;How to reduce the downtime of schema changes on large tables.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How to make database cloning instantaneous for development and testing.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;How to make the database development workflow resemble the code workflow and integrate it with the overall CI/CD pipeline.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;0 to Limitless completed, next from Limitless to Flawless. Go Aurora.&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Rich Bowen: "The Ultimate Product is Trust"</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Thu, 14 Sep 2023 11:27:38 +0000</pubDate>
      <link>https://dev.to/guoyuecn/rich-bowen-the-ultimate-product-is-trust-i3i</link>
      <guid>https://dev.to/guoyuecn/rich-bowen-the-ultimate-product-is-trust-i3i</guid>
<description>&lt;p&gt;Rich Bowen has been involved in open source since before we started calling it that. At the Apache Software Foundation, Rich serves as a board member and VP Conferences; he is also an open-source strategist at AWS. These multiple roles have granted him a diverse and profound understanding of open source. &lt;/p&gt;

&lt;p&gt;After he delivered his keynote speech, "Talking with Management about Open Source," at CommunityOverCode Asia 2023, we had a quick chat to explore more behind the speech. Besides management and open source, we also discussed approaches and strategies of AWS and the Apache Software Foundation (ASF) in open-source projects, what matters most for building open-source ecosystems, and how Rich manages to balance different roles, etc.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4v87kc95ghgte46l1i7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq4v87kc95ghgte46l1i7.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: In recent years, many Chinese companies have begun to establish their own open-source program offices. As a well-known open-source company, how does AWS manage, operate, and promote open source?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: In general, AWS and Amazon have been building on open source from the very beginning. Everything that we have done has relied on open source. At Amazon, we have what we call our leadership principles. There are some things that guide the way we think. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjru4uj6ozi2rrxcybw6o.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjru4uj6ozi2rrxcybw6o.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The first one is customer obsession. We're always concerned about the customer and serving their needs foremost, so we've built all of this infrastructure, and all of these products are built on top of open source. The most important thing is that open-source projects are sustainable. Everything that we do around open source is focused on making sure that those open-source projects remain healthy. When there are several projects to choose from, we try to pick the one that has a healthy ecosystem and a healthy community. That can mean a lot, including the involvement of many companies and transparent conversations. Then, we watch those communities closely to make sure they stay healthy, and we also try to participate actively to keep them healthy; that summarizes our biggest and most critical focus.&lt;/p&gt;

&lt;p&gt;The way that we set out to promote open source is to do it with the community.  It's not like doing a promotion, instead we try to do it with the community. For example, we rely on the Apache projects so much, like Kafka and Airflow, etc. We promote them by being involved in their conferences, such as Kafka Summit and Airflow Summit. Instead of trying to do our own independent promotion, we do it with the community. In this way, we ensure the voice is from the community rather than the voice of Amazon. We also get involved in community events, like KubeCon or open-source summits.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: It's more about making the company step back, but for some open-source companies invested in the project a lot, making clear decisions can be difficult. They might blur the lines due to investment or treat the project as their brand.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: That's correct. Occasionally, you'll see the promotion of an open-source project, and you'll wonder, are they talking about the project or are they talking about the company? It's hard to tell. &lt;/p&gt;

&lt;p&gt;Like any big company, there's going to be a difference between one department and another. But my job, as an open-source strategist, is to engage with those departments and advise them on what we believe to be the correct way to engage, which is to put the community first and not us. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: Could you please share with us some astonishing AWS open-source projects?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: There are two main types of open-source communities that Amazon is involved in. From my perspective, there are ones that are primarily Amazon, and ones that are primarily community.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8tpxzveo4hr789xs09ia.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8tpxzveo4hr789xs09ia.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Some open-source projects that we're involved with are focused around one of our services, and so there's no real incentive for other people to get engaged in it unless they're customers. There are ones like Apache Kafka, where there are many companies involved. I'm primarily interested in that second kind, the ones that are real community projects. At Amazon, most of the projects that I'm involved with are, in fact, Apache projects. &lt;/p&gt;

&lt;p&gt;One of the projects that I'm most excited about is Apache Airflow because it's a project where Amazon is very involved. We've got plenty of full-time engineers who are working on it, but the project is not owned by Amazon. It's a community project, and there are many companies involved. That is one of the models for the best way that we engage in projects from my perspective. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: From your perspective, how to build an open-source ecosystem?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: Transparency is the most important thing. All the discussions regarding Apache Airflow take place on the Apache Airflow mailing list. We don't have an internal meeting and then go to the project saying “Here's what we've decided”, instead, we make a proposal to the community, and it's discussed and decided in the community. So transparency and working in public is the first part of that. &lt;/p&gt;

&lt;p&gt;The second part of that is listening with humility. It's not like I have the answers, and you should agree with me. It's listening to what the entire community thinks and then making the decision together. One of the early thinkers in open source was Bill Joy, and he founded Sun Microsystems. One of the things that he said, and I always think about is, no matter what company you work for, the smartest people in the world work somewhere else. So, thinking your team has all the answers is arrogant and short-sighted. You should listen because the best ideas always come from somewhere else because they make you think in a new way.&lt;/p&gt;

&lt;p&gt;So, listening is the biggest part there. You have to earn trust from the community because it’s hard to gain trust, but it's so easy to lose trust. You have to be respectful, listen politely and calmly, and contribute your ideas, but don't try to force people to see your perspective. &lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: Now that you work both for AWS and the Apache Software Foundation, do you find there are some similarities and differences?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: Yeah, the most significant difference is obviously the motivation. Companies exist to serve customers and make money. That's what we're there for. The Apache community exists to produce free software for the public good.&lt;/p&gt;

&lt;p&gt;But from there, I see a lot of overlap because as I mentioned earlier, customer obsession is our top leadership principle, and that's the same thing with an open-source project. If you're not focused on the user, then you're missing the point. &lt;/p&gt;

&lt;p&gt;As for a company to be successful, you have to think of all the users as your customers, whether they're paying you or not. They might someday, maybe they won't. But you need to make sure that you're developing products that people want, so that's the same thing with open source. &lt;/p&gt;

&lt;p&gt;One more thing that remains constant between a prosperous company and a thriving open-source community is that no matter what you're creating – be it a car, a service, or software – the ultimate product is trust.&lt;/p&gt;

&lt;p&gt;Do your customers trust you? Because if they don't trust you, no matter how good your product is, they're going to go somewhere else, and it's the same in open-source communities. Trust is always the most important thing. If your open-source community burns the trust of its users, no matter how good your product is, they're not going to use it. We see it weekly: open-source projects where decisions are made without consulting the community, and then everybody moves to some other product overnight. Trust is the biggest part of that.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: From the foundation or community perspective, is there any conflict when you present different roles? Let’s say, having the feeling of it's the wrong way to go?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj86csg2pvecrp8qqfiiu.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fj86csg2pvecrp8qqfiiu.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: Yeah. What's important about when you have a conflict of interest is to be honest about it, to be transparent. &lt;/p&gt;

&lt;p&gt;For example, there are projects that my company relies on very heavily, and I may want the project to make a certain decision. Or one of our competitors is involved in the project and I may not want them to be successful. So it's critical for me to say when I'm having these conversations, by the way, I work for Amazon, these are our interests, this is our potential conflict of interests. Then, once you have disclosed that, you would try to put the community first. &lt;/p&gt;

&lt;p&gt;This goes back to an earlier point. It's significant that all of your involvement with the projects is transparent and upstream first. Because if you're making decisions internally and then taking them to the project, you're not putting the project's interests first. But if you're focused on your customers and your users, then you want the project to be successful. There's a phrase in English, the tide lifts all boats, which means if I make the project successful, I'm going to help my competitors and that's okay. Because the tide lifts all of us, and we collaborate on this thing. Maybe I'm helping my competitors, but mostly I'm helping my customers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: What aspects of open source should management be aware of, and why are these aspects particularly critical?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: Okay. So I think that one of the most critical things for management to understand is that open source is part of the supply chain. &lt;/p&gt;

&lt;p&gt;No matter what your product is, you rely on raw materials, natural resources coming from somewhere else. If you just consume, then at some point you're going to run out. If we think about open source as our raw materials, then we have an obligation to make sure that that source of raw materials continues to be healthy. So if you are a carpenter, you want to make sure there is always a forest: you plant new trees rather than just cutting them down. So what I try to communicate to management is that sustainability is our job. It is our job to make sure that the projects we rely on remain healthy. &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2cn61o0qa4u4909bubi.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv2cn61o0qa4u4909bubi.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Now, historically, there are many companies that have built businesses on top of open source without contributing back, which leads to several situations. One is that the project will resent you and find ways to oppose you. You can end up in a situation where you've built a product on top of something whose license is then changed to cut you out of the picture.&lt;/p&gt;

&lt;p&gt;The other is that if you don't actively participate, then you don't have a voice in the decisions that are made. Perhaps the project will go in a direction that doesn't benefit you. And so active participation in your supply chain is critical. And this is true. It's not just true in software, it's true in any business. If you rely on coal, then you should understand how the coal business works so that you can see months or years in advance when a problem is coming. That is the most important thing to me. &lt;/p&gt;

&lt;p&gt;The last thing is trust, which is again tied to transparency. You need to make sure that you are in open communication with the project, and you don't make decisions, you don't do your marketing in ways that annoy the community or ways that embarrass or devalue the community. You need to do it in ways that give credit, give the correct credit to the people who are doing the work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: You mentioned consuming and applying open-source software, which brings up another topic: contribution. What challenges do management teams face when using, contributing to, or applying open-source software? How can these challenges be addressed?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: I think the biggest difficulty that businesses have in working with open source, and I mentioned in our round table yesterday, is being patient. When you try to make a decision in an open-source project, you can't just have your manager say, ‘This is how we're going to do it’. You have to discuss it. &lt;/p&gt;

&lt;p&gt;Occasionally, you have to wait days or weeks to arrive at a decision, and that is very frustrating for companies that want to launch a product. Say I have a product launch deadline and an upcoming conference where I need to announce a new feature; you can’t force that timeline on open source. That makes it a challenge to communicate to management why they should wait. &lt;/p&gt;

&lt;p&gt;The reason that they should wait is that we are customer obsessed, and the users are our customers. We need to make sure that we have time, and we have to be patient to hear back from the customers. So if I have this great idea for a new feature that I'm excited about and want to launch, but the users don't think it's a good idea, then eventually it will fail, so it is worth waiting. You can even consider it to be market research; it's worth waiting to hear back from the user community. The developers involved in a project represent themselves, and they're the most knowledgeable users, also the power users. We have to make sure that we include them in that decision process, and that's hard to communicate to managers who are facing deadlines.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SegmentFault: So there’s a natural conflict between management, which represents the company, and the open-source community. How can you convince management that understanding open-source concepts helps align strategies to meet business goals?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik1194fs1p9cful1nnyt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fik1194fs1p9cful1nnyt.png" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Rich Bowen: Going back to my earlier answer, I think the key is having them understand that this is their source of materials, their supply chain. I find that kind of metaphor the most useful: if we're selling milk, we need not kill the cow.&lt;/p&gt;

&lt;p&gt;Now, some people will say, ‘We should just buy the cow, own it and ignore the farmers. We should just do this ourselves’. It’s important for a business to understand what it is uniquely good at and what to leave to others. So you collaborate on the things that are common, and then concentrate as a company on the things that you're uniquely good at.&lt;/p&gt;

&lt;p&gt;For example, AWS is entirely about and good at hosting network services. We have data centers all over the world. We have enormous scale, fast networks, and talented system administrators. But the software that we run is developed by the entire world, the entire community, and we share that. Then, we focus on what we're uniquely good at. &lt;/p&gt;

&lt;p&gt;I spent nine years at Red Hat. Red Hat is not a software company. Red Hat sells support instead of software. What I would advise salespeople is not to emphasize being the best software producers in the world, as the software is made communally. What we're good at is selling support. That was the message that I thought a lot about in my Red Hat years, which is very much the same thing at AWS.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Learn more about Rich's thoughts and insight on open source in his blog at &lt;a href="https://drbacchus.com/" rel="noopener noreferrer"&gt;DrBacchus.com&lt;/a&gt;.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Author: Anne Zhu&lt;br&gt;
Anne Zhu is the community manager of Answer, SegmentFault.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzhwh3im9214zprbslqw.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvzhwh3im9214zprbslqw.jpg" alt="Image description"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Creating a Quantum Computing Portfolio Segment Using Variational Quantum Eigensolver</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Wed, 13 Sep 2023 12:06:56 +0000</pubDate>
      <link>https://dev.to/aws/creating-a-quantum-computing-portfolio-segment-using-variational-quantum-eigensolver-ioc</link>
      <guid>https://dev.to/aws/creating-a-quantum-computing-portfolio-segment-using-variational-quantum-eigensolver-ioc</guid>
      <description>&lt;p&gt;In the field of quantum computing, Qudoor began research and development of a distributed ion-trap quantum computer in 2020, and continues to develop ion-trap chips, precision laser systems, qubit photoelectric measurement and control systems, high-speed electronic timing and control systems, a quantum programming language, a quantum cloud, a quantum computing library, quantum applications, and other systems. In 2021 it launched AbaQ-1, China's first ion-trap quantum computer engineering machine, laying a solid foundation for the 100-qubit distributed ion-trap quantum computer, with a quantum volume of more than 100 million, planned for completion in 2023. On the application side, it has established partnerships with leading domestic enterprises in insurance, securities, new drug research and development, and encryption and decryption, aiming to bind deeply with industry applications and build a market ecosystem.&lt;/p&gt;

&lt;h2&gt;
  
  
  Disclaimer:
&lt;/h2&gt;

&lt;p&gt;The information provided in this article is for informational purposes only and should not be considered financial advice. The author is not a licensed financial advisor. Readers should consult a qualified professional before making any investment decisions. Investing involves risk, and past performance is not indicative of future results. The author assumes no responsibility for any financial losses resulting from the use of this information.&lt;/p&gt;

&lt;p&gt;The rapid advancement of quantum computing presents enticing possibilities and potential applications across a range of industries. As investors explore opportunities in this emerging sector, the development of a well-optimized investment portfolio segment becomes indispensable. This article delves into the process of constructing a quantum computing investment portfolio segment using the Variational Quantum Eigensolver (VQE) algorithm. While the majority of the article maintains a non-technical approach, I have dedicated the final section to a closer examination of the underlying code used to create this portfolio segment.&lt;/p&gt;

&lt;h2&gt;
  
  
  Metric Research:
&lt;/h2&gt;

&lt;p&gt;In finance, sigma (σ) and mu (μ) are often used to represent standard deviations and means. In quantum computing and quantum circuits, however, sigma and mu no longer represent the same concepts; they are related to quantum states and quantum gate operations. The following explains what they mean in quantum computing and why the expected value of a quantum circuit can be calculated from these two pieces of data.&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Sigma (σ): In quantum computing, σ is often used to represent a form of the Pauli matrix, for example, σ_x, σ_y, and σ_z. These matrices are fundamental operators in quantum computing that describe the rotation and measurement of quantum bits (qubits). The σ matrix is related to operations and measurements in the quantum circuit, not the dispersion of the data.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Mu (μ): In quantum computing, μ is not usually used to represent a specific concept. It is not directly related to the expected value in quantum computing, but is used to represent the complex amplitudes of quantum states. A quantum state describes a qubit as a linear combination of basis states, each with its own amplitude.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The expected value calculation of quantum circuits is usually related to the Pauli matrices and the inner product (also known as the expected value) of quantum states. The expected value can be expressed as follows:&lt;br&gt;
E(O) = ⟨Ψ∣O∣Ψ⟩&lt;br&gt;
where E(O) represents the expected value of the operator O, |Ψ⟩ represents the quantum state, and O represents a Pauli matrix or another quantum operator. This expected value is the average measurement obtained by applying the operator O to a given quantum state.&lt;/p&gt;
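For a concrete picture of this formula, here is a minimal NumPy sketch (my own illustration, not from the original article) that computes ⟨Ψ∣O∣Ψ⟩ for a single qubit with the Pauli-Z operator:

```python
import numpy as np

# Pauli-Z operator
sigma_z = np.array([[1, 0], [0, -1]], dtype=complex)

# Example single-qubit state: the equal superposition of |0⟩ and |1⟩
psi = np.array([1, 1], dtype=complex) / np.sqrt(2)

# E(O) = ⟨Ψ∣O∣Ψ⟩: np.vdot conjugates its first argument
expectation = np.vdot(psi, sigma_z @ psi).real
print(expectation)  # 0.0 for the equal superposition
```

Measuring Z on the equal superposition yields +1 and -1 with equal probability, so the average is zero, which matches the inner-product calculation.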

&lt;h2&gt;
  
  
  The Algorithm:
&lt;/h2&gt;

&lt;p&gt;The VQE (Variational Quantum Eigensolver) algorithm is an approximate solution method based on quantum computing, commonly used to solve problems in quantum chemistry and materials science. Although it is not specifically designed for the financial sector, it can use the power of quantum computing to solve portfolio optimization problems in the financial sector.&lt;/p&gt;

&lt;p&gt;First, the portfolio optimization problem is transformed into a problem in quantum computing. In finance, the goal of a portfolio problem is usually to maximize returns or minimize risks while taking into account the weighting of various assets.&lt;/p&gt;

&lt;p&gt;The Hamiltonian is a quantum mechanical concept that describes the total energy of a system and can be used to represent portfolio problems. By constructing a Hamiltonian, the goal of the portfolio problem is transformed into a problem of finding the smallest eigenvalue (ground state energy), where the eigenvalue corresponds to the target value of the optimal portfolio.&lt;/p&gt;

&lt;p&gt;In order to simulate the Hamiltonian, an appropriate quantum circuit structure is selected. This circuit usually consists of a series of quantum gate operations that can adjust the parameters in the circuit to optimize the portfolio problem.&lt;/p&gt;

&lt;p&gt;Use a quantum computer or simulator to run the VQE algorithm. The VQE algorithm uses a classical computer to continuously adjust the circuit parameters to approximate the minimum eigenvalue of the Hamiltonian as closely as possible. This process may require several iterations until satisfactory accuracy is reached or an optimal solution is found.&lt;/p&gt;

&lt;p&gt;Once the VQE algorithm finds the minimum eigenvalue, it can decode the corresponding quantum states and parameters. These parameters can be translated into asset weights in the financial portfolio, so as to obtain the optimal portfolio allocation.&lt;/p&gt;

&lt;p&gt;Finally, the performance of the obtained portfolio allocation is evaluated, including return, risk and other indicators. If needed, the configuration can be further optimized to meet specific investment objectives and constraints.&lt;/p&gt;
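To make the Hamiltonian-construction step concrete, here is a toy sketch (my own, with assumed example numbers, not the article's code): for two assets with expected returns mu and covariance sigma, each basis state's bitstring selects a subset of assets, and the problem Hamiltonian is diagonal, with each energy equal to the risk penalty minus the expected return of that subset:

```python
import numpy as np

# Assumed toy inputs for illustration: expected returns and covariance
mu = np.array([0.08, 0.14])          # expected returns of 2 assets
sigma = np.array([[0.10, 0.02],
                  [0.02, 0.20]])     # covariance matrix
q = 0.5                              # risk-aversion weight

# Each basis state |b1 b0⟩ selects a subset of assets; the Hamiltonian is
# diagonal, with energy = risk penalty minus expected return of that subset.
n = 2
energies = []
for state in range(2 ** n):
    w = np.array([(state >> i) % 2 for i in range(n)], dtype=float)
    energies.append(q * w @ sigma @ w - mu @ w)

H = np.diag(energies)  # the problem Hamiltonian
best = int(np.argmin(energies))
print(f"ground state = |{best:02b}⟩, energy = {energies[best]:.4f}")
# ground state = |11⟩, energy = -0.0500
```

The minimum eigenvalue of this diagonal Hamiltonian (here, holding both assets) is exactly what the VQE loop searches for on a quantum device.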

&lt;h2&gt;
  
  
  Amazon Braket:
&lt;/h2&gt;

&lt;p&gt;Amazon Braket is a fully managed quantum computing service designed to help accelerate scientific research and software development in quantum computing. Users can develop quantum programs with the Amazon Braket SDK through a local Jupyter Notebook/IDE or the Amazon console, and run them on the quantum hardware or simulation resources that AWS provides.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhweu5h6kcpxqy6d1nz0l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhweu5h6kcpxqy6d1nz0l.png" alt=" " width="800" height="281"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below is how to configure AWS to use IonQ:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Click “Security credentials”, create an access key, and keep its status “Active”&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feg57v6z39f7uki5tok41.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Feg57v6z39f7uki5tok41.png" alt=" " width="800" height="387"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Before using Boto3, you need to set up authentication credentials for your AWS account using either the &lt;a href="https://console.aws.amazon.com/iam/home" rel="noopener noreferrer"&gt;IAM Console&lt;/a&gt; or the AWS CLI. You can either choose an existing user or create a new one.
For instructions about how to create a user using the IAM Console, see &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_users_create.html#id_users_create_console" rel="noopener noreferrer"&gt;Creating IAM users&lt;/a&gt;. Once the user has been created, see &lt;a href="https://docs.aws.amazon.com/IAM/latest/UserGuide/id_credentials_access-keys.html#Using_CreateAccessKey" rel="noopener noreferrer"&gt;Managing access keys&lt;/a&gt; to learn how to create and retrieve the keys used to authenticate the user.
If you have the &lt;a href="http://aws.amazon.com/cli/" rel="noopener noreferrer"&gt;AWS CLI&lt;/a&gt; installed, you can use the &lt;code&gt;aws configure&lt;/code&gt; command to configure your credentials file:
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws configure
&lt;/code&gt;&lt;/pre&gt;
Alternatively, you can create the credentials file yourself. By default, its location is &lt;code&gt;~/.aws/credentials&lt;/code&gt;. At a minimum, the credentials file should specify the access key and secret access key. In this example, the key and secret key for the account are specified in the default profile:
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[default]
aws_access_key_id = YOUR_ACCESS_KEY
aws_secret_access_key = YOUR_SECRET_KEY
&lt;/code&gt;&lt;/pre&gt;
You may also want to add a default region to the AWS configuration file, which is located by default at &lt;code&gt;~/.aws/config&lt;/code&gt;:
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[default]
region=us-east-1
&lt;/code&gt;&lt;/pre&gt;
Alternatively, you can pass a &lt;code&gt;region_name&lt;/code&gt; when creating clients and resources.
You have now configured credentials for the default profile as well as a default region to use when creating connections. See Configuration for in-depth configuration sources and options.&lt;/li&gt;
&lt;li&gt; After running a circuit on an AWS device, the page below will display all Quantum Tasks; click any one of them&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwe0whs8i104rzuawtu9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fxwe0whs8i104rzuawtu9.png" alt=" " width="800" height="384"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Click “See results in XX” to see the results of the corresponding task&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fteokyo3bcieqbwu8lyel.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fteokyo3bcieqbwu8lyel.png" alt=" " width="800" height="358"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt; Results.json is the result JSON file of the task; you can click it to download&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Code:
&lt;/h2&gt;

&lt;p&gt;This section is designed for the technical reader who might have an interest in the code behind the creation of our portfolio segment. The first step of the analysis was to collect the relevant stock data. The sample raw data can be seen as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuv3hq0weymtgca547f7y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fuv3hq0weymtgca547f7y.png" alt=" " width="446" height="620"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;I then aggregated the data using the following code:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fteyja31mt7onpjswzui1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fteyja31mt7onpjswzui1.png" alt=" " width="800" height="599"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The resulting aggregated data used for the optimization can be seen below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tf5mjyfoq2ax6ye5y8k.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tf5mjyfoq2ax6ye5y8k.png" alt=" " width="800" height="130"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Below is the code for the variational quantum eigensolver:&lt;/p&gt;
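The solver listing itself appears as an image in the original post. As a hedged stand-in, here is a minimal classical simulation of a VQE-style loop (my own sketch, with an assumed toy diagonal Hamiltonian, not the article's actual code): a product ansatz of RY rotations whose expected energy is minimized by a simple grid search standing in for the classical optimizer:

```python
import numpy as np

# Assumed toy diagonal portfolio Hamiltonian over 2 qubits (illustrative values)
H_diag = np.array([0.00, -0.03, -0.04, -0.05])

def ansatz(thetas):
    """Product state of single-qubit RY rotations: amplitudes over |q1 q0⟩."""
    states = [np.array([np.cos(t / 2), np.sin(t / 2)]) for t in thetas]
    return np.kron(states[1], states[0])  # 4 amplitudes

def energy(thetas):
    """Expected value ⟨ψ(θ)∣H∣ψ(θ)⟩ for a diagonal H."""
    amps = ansatz(thetas)
    return float(np.sum(np.abs(amps) ** 2 * H_diag))

# Classical outer loop: grid search plays the role of the optimizer
grid = np.linspace(0, np.pi, 41)
best = min(
    ((energy((t0, t1)), (t0, t1)) for t0 in grid for t1 in grid),
    key=lambda pair: pair[0],
)
print(f"minimum energy: {best[0]:.4f}")        # minimum energy: -0.0500
print(f"best angles: ({best[1][0]:.3f}, {best[1][1]:.3f})")
```

On real hardware the energy function would be estimated from repeated circuit measurements rather than computed exactly, but the optimize-then-decode structure is the same.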

&lt;h2&gt;
  
  
  The result:
&lt;/h2&gt;

&lt;p&gt;Based on the optimization process, the final portfolio allocation is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjm5zx9y5ozdlvfvz14lb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjm5zx9y5ozdlvfvz14lb.png" alt=" " width="737" height="858"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  About Us:
&lt;/h2&gt;

&lt;p&gt;Qudoor Quantum Technology (Beijing) Co., Ltd. (hereinafter referred to as Qudoor) was founded by overseas returnees in January 2019 and is headquartered in Beijing Zhongguancun Software Park. It focuses on quantum communication equipment manufacturing and full-stack quantum computer development, relying on its technical team's 20 years of technology accumulation and rich product experience in the field of quantum information. It is the first domestic technology-innovation enterprise with both quantum computing and quantum communication core technology reserves and product research and development capabilities, and it plans to design and build a new generation of quantum Internet in the future.&lt;/p&gt;

&lt;p&gt;Authors:&lt;br&gt;
Keith Yan（丘秉宜）, China's first AWS Community Hero&lt;/p&gt;

&lt;p&gt;Liu Li（刘利）, C++ engineer at Qudoor&lt;/p&gt;

&lt;p&gt;Cheng Shuang（程骦）, Python engineer at Qudoor&lt;/p&gt;

&lt;p&gt;Huang Zonggui（黄宗贵）, front-end engineer at Qudoor&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Push or Pull, is this a question?</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Wed, 09 Aug 2023 09:57:49 +0000</pubDate>
      <link>https://dev.to/aws/push-or-pull-is-this-a-question-9n9</link>
      <guid>https://dev.to/aws/push-or-pull-is-this-a-question-9n9</guid>
      <description>&lt;p&gt;&lt;strong&gt;Author Yan Wenze&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Senior Development Engineer of MatrixOrigin&lt;/p&gt;

&lt;p&gt;The SQL execution engine of the database is responsible for processing and executing SQL requests. Generally, the query optimizer outputs a physical execution plan, which is usually composed of a series of operators. To ensure efficient execution, operators need to be composed into a pipeline.&lt;br&gt;
There are two ways to build a pipeline: the first is the demand-driven (Pull) pipeline, where an operator continuously pulls the next data tuple from its child operator; the second is the data-driven (Push) pipeline, where an operator pushes each data tuple to the next operator. So, which type of pipeline construction is better? This may not be an easy question to answer. Snowflake’s paper mentions that push-based execution improves cache efficiency by removing control flow logic from data loops. It also allows Snowflake to effectively handle DAG plans of pipelines, creating additional opportunities for sharing and pipelining of intermediate results.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrgictxr97p78h34k0h4.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgrgictxr97p78h34k0h4.jpg" alt=" " width="800" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The following figure from reference [1] illustrates the difference between Push and Pull most directly:&lt;/p&gt;

&lt;p&gt;In simple terms, the Pull pipeline is based on the iterator model, and the classic volcano model is based on Pull. The volcano model is a mature SQL execution solution in databases. This design pattern abstracts each operation in relational algebra as an operator, so the entire SQL statement forms an operator tree (execution plan tree); by calling the next interface in a top-down manner, the volcano model can process data on a database row basis, as shown in the next() method in the figure. The next() call recurses until it reaches a leaf node of the query plan tree, which can access the data itself. Therefore, the Pull model is very easy to understand and implement: each operator implements the next() method, and the recursive calls begin once the query plan tree is constructed.&lt;/p&gt;

&lt;p&gt;The volcano model has the following characteristics:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Process data row by row, where each row of data is processed by invoking the next interface.&lt;/li&gt;
&lt;li&gt;Invoking the next interface requires a virtual function mechanism. Virtual functions require more CPU instructions than direct function calls and are more expensive.&lt;/li&gt;
&lt;li&gt;Processing data on a per-row basis leads to inefficient CPU cache utilization and unnecessary complexity. The database needs to keep track of which row is being processed in order to move to the next row. Additionally, after processing a row, the next row needs to be loaded into the CPU cache. However, the CPU cache can store more than just one row of data.&lt;/li&gt;
&lt;li&gt;The most significant advantage of the volcano model is that the interface looks clean and easy to understand. Since data flow and control flow are combined, each operator has a clear abstraction. For example, a Filter operator only needs to focus on how to filter data based on predicates, and an Aggregate operator only needs to focus on how to aggregate data.&lt;/li&gt;
&lt;/ol&gt;
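To make the iterator model concrete, here is a minimal volcano-style Pull pipeline sketch in Python (my own illustration, not from the article), where each operator exposes next() and the root drives execution through recursive calls:

```python
class Scan:
    """Leaf operator: yields rows from an in-memory table."""
    def __init__(self, rows):
        self.it = iter(rows)
    def next(self):
        return next(self.it, None)  # None signals end of stream

class Filter:
    """Pulls rows from its child and keeps those matching the predicate."""
    def __init__(self, child, predicate):
        self.child, self.predicate = child, predicate
    def next(self):
        while True:
            row = self.child.next()
            if row is None or self.predicate(row):
                return row

class Project:
    """Pulls rows from its child and applies a per-row expression."""
    def __init__(self, child, expr):
        self.child, self.expr = child, expr
    def next(self):
        row = self.child.next()
        return None if row is None else self.expr(row)

# Plan tree Project(Filter(Scan)); the caller pulls one row at a time.
plan = Project(Filter(Scan([1, 2, 3, 4, 5]), lambda r: r % 2 == 1),
               lambda r: r * 10)
out = []
while (row := plan.next()) is not None:
    out.append(row)
print(out)  # [10, 30, 50]
```

Note how control flow (the while loop, the end-of-stream checks) is interleaved with each operator's processing logic, which is exactly the overhead the text describes.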

&lt;p&gt;To reduce overhead, the Pull model can introduce vectorized acceleration by implementing the GetChunk() method to retrieve a batch of data instead of fetching one row at a time, using the Projection operator as an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;void Projection::GetChunk(DataChunk &amp;amp;result) {
        // get the next chunk from the child        child-&amp;gt;GetChunk(child_chunk);
        if (child_chunk.size() == 0) {
            return;
        }
        // execute expressions        executor.Execute(child_chunk, result);
 }
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In this code snippet, there are some control flow-related lines that are coupled with the operator’s processing logic. Each operator implementation needs to include such code. For example, there is a need to check if child_chunk is empty because the child has performed filtering during the GetChunk operation. Therefore, the internal implementation of the Pull model’s interface can be redundant and prone to errors.&lt;/p&gt;

&lt;p&gt;Unlike the iterator model in the Pull pipeline, the Push model has a reversed data flow and control flow. Specifically, instead of the destination Operator requesting data from the source Operator, data is pushed from the source Operator to the destination Operator, with the source Operator passing data as a parameter to the consumption method (Consume) of the destination Operator. Therefore, the Push pipeline model is equivalent to the Visitor model, where each Operator no longer provides next, but Produce/Consume instead. The Push model was proposed by Hyper [3] as the Pipeline Operator. Its motivation is that the iterator model is Operator-centric: the boundaries between Operators are too sharp, so data transfer between Operators (from CPU register to memory) generates additional memory bandwidth overhead and cannot maximize Data Locality. Therefore, execution needs to switch from Operator-centric to data-centric, keeping data in registers as long as possible to ensure maximum Data Locality. Furthermore, Hyper introduced the operating system's NUMA scheduling framework into the query execution scheduling of databases [2], implementing parallelism-awareness for the Push model:&lt;/p&gt;

&lt;p&gt;Use Pipeline to combine operators and perform bottom-up Push scheduling. When a task finishes execution, it will notify the scheduler to enqueue subsequent tasks. Each data block unit is called a Morsel, containing about 10,000 rows of data. The execution unit of a query task is to process one Morsel.&lt;/p&gt;

&lt;p&gt;Prioritize scheduling subsequent tasks generated by a task on the same core to avoid inter-core data communication overhead.&lt;/p&gt;

&lt;p&gt;When a core is idle, it has the ability to “steal” a task from other cores to execute (Work Stealing). Although this may sometimes increase data transfer overhead, it relieves the accumulation of tasks on busy cores, and overall accelerates task execution.&lt;/p&gt;

&lt;p&gt;When a core is idle and able to steal work, the scheduler does not immediately satisfy the idle core’s request, but lets it wait for a while. During this time, if busy cores can complete their own tasks, then cross-core scheduling can be avoided.&lt;/p&gt;
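By contrast, a Push pipeline can be sketched like this (again my own illustration, not Hyper's code): the source splits the table into small morsels and pushes each batch into the next operator's Consume method, so control flows bottom-up from the source to the sink:

```python
class FilterOp:
    """Intermediate operator: consumes a batch, pushes the survivors on."""
    def __init__(self, predicate, next_op):
        self.predicate, self.next_op = predicate, next_op
    def consume(self, batch):
        kept = [row for row in batch if self.predicate(row)]
        if kept:
            self.next_op.consume(kept)

class SinkOp:
    """Pipeline sink: collects all rows (e.g. a hash-join build side)."""
    def __init__(self):
        self.rows = []
    def consume(self, batch):
        self.rows.extend(batch)

def scan(table, target, morsel_size=2):
    """Source operator: splits the table into morsels and pushes them."""
    for start in range(0, len(table), morsel_size):
        target.consume(table[start:start + morsel_size])

sink = SinkOp()
scan([1, 2, 3, 4, 5], FilterOp(lambda r: r % 2 == 1, sink))
print(sink.rows)  # [1, 3, 5]
```

The intermediate Filter contains no scheduling logic at all; in a morsel-driven system, parallelism would be introduced by having a scheduler hand different morsels from scan() to different worker threads.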

&lt;p&gt;Take multi-table Join as an example:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;SELECT ... 
FROM S JOIN R USING (A)
JOIN T USING (B);
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1f0mlo1okzgpbbfc9ib.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs1f0mlo1okzgpbbfc9ib.jpg" alt=" " width="800" height="871"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This query consists of multiple Pipelines. Pipelines need to run in parallel with each other, and also in parallel internally. In practice, controlling parallelism only needs to happen at the endpoints of a Pipeline. For example, in the above diagram, intermediate operators like Filter do not need to consider parallelism themselves, because the source TableScan will Push data to them, and the Sink of the Pipeline is Hash Join, whose Hashtable Build phase needs to be parallelism-aware while the Probe phase does not. Basing the parallelism-awareness of Pipelines on Push makes it technically easier.&lt;/p&gt;

&lt;p&gt;While it is relatively easy to implement parallelism-awareness in the Push model, why is it not so easy in the Pull model? Since scheduling is top-down rather than data-driven, a direct idea is to partition the data and have the optimizer generate physical plans according to the partitioning, executing the physical plans of different partitions in parallel. This easily leads to more complex query plans (introducing more partitions), and it is hard to achieve automatic load balancing: when the input data is partitioned, after some Operators (like Filter) the amount of data retained in different partitions can vary a lot, so subsequent operators face data skew. In addition, different CPUs may spend different amounts of time processing the same amount of data; various factors such as environmental interference, task scheduling, blocking, and errors can slow down or even terminate processing, reducing overall efficiency.&lt;/p&gt;

&lt;p&gt;Hyper’s Push model was proposed in 2011. Before that, most SQL engines adopted the iterator-based Volcano Pull model. Known systems built on the Push model include Presto, Snowflake, Hyper, QuickStep, HANA, DuckDB (switched from the Pull model to the Push model in October 2021, see reference [4]), StarRocks, etc.&lt;/p&gt;

&lt;p&gt;ClickHouse is an outlier: in its own Meetup materials it claims to be a combination of Pull and Push, where the query uses the Pull model. Its code uses the Pull term as well; the core driver of query scheduling is PullingAsyncPipelineExecutor. After generating the QueryPlan (logical plan) from the AST and applying some RBO optimizations, ClickHouse converts the QueryPlan into Pipelines in a postorder traversal. The generated Pipelines are very similar to the Push model, because each Operator (ClickHouse calls it a Processor) in the Pipeline has inputs and outputs: the Operator pulls data from its input, processes it, and pushes it to the next Operator in the Pipeline. Therefore, ClickHouse is not a traditional Volcano Pull implementation, but generates Pipeline execution plans from the query plan tree. Pipelines are generated from the plan tree by postorder traversal, starting from nodes without children to construct the first Pipeline, which is the standard way to generate Pipeline Operators in the Push model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;QueryPipelinePtr QueryPlan::buildQueryPipeline(...){
    struct Frame
    {
        Node * node = {};
        QueryPipelines pipelines = {};
    };
    QueryPipelinePtr last_pipeline;
    std::stack&amp;lt;Frame&amp;gt; stack;
    stack.push(Frame{.node = root});
    while (!stack.empty())
    {
        auto &amp;amp; frame = stack.top();
        if (last_pipeline)
        {
            frame.pipelines.emplace_back(std::move(last_pipeline));
            last_pipeline = nullptr;
        }
        size_t next_child = frame.pipelines.size();
        if (next_child == frame.node-&amp;gt;children.size())
        {
            last_pipeline = frame.node-&amp;gt;step-&amp;gt;updatePipeline(std::move(frame.pipelines), build_pipeline_settings);
            stack.pop();
        }
        else
            stack.push(Frame{.node = frame.node-&amp;gt;children[next_child]});
    }
    return last_pipeline;
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Next is Pipeline scheduling. First PullingAsyncPipelineExecutor::pull pulls data from the Pipeline:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;PullingAsyncPipelineExecutor executor(pipeline);
    Block block;
    while (executor.pull(block, ...))
    {
        if (isQueryCancelled())
        {
            executor.cancel();
            break;
        }
        if (block)
        {
            if (!state.io.null_format)
                sendData(block);
        }
    }
    sendData({});
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When pull is called, a thread is selected from the thread_group, then data.executor-&amp;gt;execute(num_threads) runs the PipelineExecutor, where num_threads is the number of parallel threads. Next, PipelineExecutor converts the Pipeline into an ExecutingGraph for physical scheduling and execution. The Pipeline is a logical structure that does not care about how to execute, while the ExecutingGraph is the physical reference for scheduling and execution. The ExecutingGraph converts the InputPort and OutputPort of Pipeline Operators into Edges; an Edge connects two Operators, and each Operator is a Node of the graph. PipelineExecutor::execute then schedules the Pipeline via the ExecutingGraph; its main job is to schedule tasks by popping ExecutingGraph::Node execution plans from the task_queue. During scheduling, threads keep traversing the ExecutingGraph, scheduling execution based on Operator execution states, until all Operators reach the finished state. Scheduler initialization picks all Nodes in the ExecutingGraph without an OutputPort to start; hence the control flow originates from the Sink Node of the Pipeline, recursively calling prepareProcessor. This differs from the Push model, where the control flow starts from the Source Node and propagates level by level. Apart from the direction of the control flow, this Pipeline Operator is identical to Push, which is why some people also categorize ClickHouse under the Push model; in much of the literature, Push is equivalent to Pipeline Operator and Pull is equivalent to Volcano. The correspondence between Pipeline and ExecutingGraph is shown below (in ClickHouse, Operator = Processor = Transformer):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4uypk8adczcq5d6suee.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu4uypk8adczcq5d6suee.jpg" alt=" " width="800" height="627"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Therefore, the Push model is parallelism-aware; the essence is to design a scheduler that controls data flow and parallelism well. In addition to the aforementioned advantages, the naive Push model also has some disadvantages: handling Limit and Merge Join is difficult (see reference [1]). For the former, the Operator cannot easily control when the source Operator stops producing data, so some elements may be produced but never used. For the latter, since the Merge Join Operator cannot know which source Operator produces the next data Tuple, Merge Join cannot be pipelined, so a Pipeline Breaker is needed for at least one of the source Operators, requiring materialization. The essence of both problems is still the Pipeline scheduling problem in the Push model: how consumers control producers. Apart from Limit and Merge Join, other operations, such as terminating a query in progress, face the same situation. Just as separating the query plan tree from the Pipeline enables parallelism-awareness for the Pull model, the Push model does not have to be implemented exactly as described in the papers, where only the Pipeline source can be controlled. By introducing mechanisms like ClickHouse’s task_queue, the Push model can likewise achieve level-by-level control of source Operators.&lt;/p&gt;

&lt;p&gt;MatrixOne is implemented in Golang, so it directly leverages Go language features to realize the Push model: channels are used as blocking message queues to notify producers. A query plan consists of multiple Operators, and a pipeline is an execution sequence containing multiple Operators. An Operator represents a specific operation, such as the typical Filter, Project, Hash Build, and Hash Probe. For a query plan, first determine how many pipelines there are, how many CPUs are available, and which pipelines each CPU runs. Specifically, with Golang language features: one pipeline corresponds to one goroutine, pipelines communicate via unbuffered channels, and pipeline scheduling is also driven by channels. An example is as follows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Connector Operatorfunc Call(proc *process.Process, arg interface{}) (bool, error) {
    ...
    if inputBatch == nil {
        select {
        case &amp;lt;-reg.Ctx.Done():
            process.FreeRegisters(proc)
            return true, nil
        case reg.Ch &amp;lt;- inputBatch:
            return false, nil
        }
    }
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Since it is a Push model, a query plan triggers the entire process through the Producer Pipeline. Non-producer Pipelines do not run until they receive data. After the Producer Pipeline starts, it keeps reading data and sending it to the next Pipeline via channels. It only exits in two cases:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data reading is complete&lt;/li&gt;
&lt;li&gt;An error occurs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;When a non-producer Pipeline has not yet read the data pushed by the Producer Pipeline from the channel, the Producer Pipeline blocks. Non-producer Pipelines do not execute immediately after startup unless the Producer Pipeline has placed data in the channel. After startup, Pipelines exit under two circumstances:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;An exit message is received from the channel&lt;/li&gt;
&lt;li&gt;An error occurs&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;MatrixOne allocates Producer Pipelines to specific nodes based on data distribution. After receiving a Producer Pipeline, a node derives multiple Producer Pipelines from it based on the current machine status (such as the core count) and the query plan. The parallelism of the other Pipelines is determined when they receive data.&lt;/p&gt;

&lt;p&gt;Let’s look at a simple query first:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select * from R where a &amp;gt; 1 limit 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This query has a Limit Operator, meaning there are termination conditions for the Pipeline like Cancel, Limit, Merge Join mentioned above. The Pipeline for this query is shown below, executing in parallel on 2 Cores:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fza62dqv5lh0a0hxdwoun.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fza62dqv5lh0a0hxdwoun.jpg" alt=" " width="800" height="211"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Due to the existence of Limit, the Pipeline introduces the Merge Operator. The related scheduling issues are:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Merge cannot accept data from multiple Pipelines without limit. Based on memory size, Merge needs to send a channel message through the Connector to tell upstream Pipelines to stop reading data.&lt;/li&gt;
&lt;li&gt;The number of Pipelines is determined dynamically based on the CPU count. When the Pipelines stop pushing data, the query naturally terminates, so Merge needs to flag whether transmission has ended.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s look at a more complex example, tpch-q3:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;select
    l_orderkey,
    sum(l_extendedprice * (1 - l_discount)) as revenue,
    o_orderdate,
    o_shippriority
from
    customer,
    orders,
    lineitem
where
    c_mktsegment = 'HOUSEHOLD'
    and c_custkey = o_custkey
    and l_orderkey = o_orderkey
    and o_orderdate &amp;lt; date '1995-03-29'
    and l_shipdate &amp;gt; date '1995-03-29'
group by
    l_orderkey,
    o_orderdate,
    o_shippriority
order by
    revenue desc,
    o_orderdate
limit 10
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Assume the query plan is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbtkgnmhh04p0ly2dt9p.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqbtkgnmhh04p0ly2dt9p.jpg" alt=" " width="611" height="1070"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Assume the data of these three tables are evenly distributed on two nodes node0 and node1, then the corresponding Pipelines are as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4se3xb4jyz2i74sfe2c.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy4se3xb4jyz2i74sfe2c.jpg" alt=" " width="800" height="595"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Adopting the Push model has another potential advantage: it stays consistent with the Data Flow paradigm of stream computing (such as Flink). FlinkSQL converts each Operator in the query plan into a streaming Operator, and streaming Operators pass the updates of each Operator’s computation results to the next Operator, which is logically consistent with the Push model. For MatrixOne, which intends to implement a streaming engine internally, this is an opportunity for logical reuse. Of course, implementing a streaming engine requires far more than the Push model, which is beyond the scope of this article. One last potential advantage of the Push model is that it combines naturally with query compilation (Codegen). MatrixOne has not yet implemented Codegen, which is also beyond this article’s scope.&lt;/p&gt;

&lt;p&gt;MatrixOne currently implements basic parallel scheduling based on the Push model. Many aspects will be improved in the future, such as scheduling tasks for multiple queries in a hybrid concurrent and parallel way, and making the Pipeline scheduler aware of Spill handling when Operators run out of memory, so that tasks complete with minimal IO overhead. There is a lot of interesting work in these areas, and we welcome anyone interested to explore and innovate at these levels with us.&lt;/p&gt;

&lt;p&gt;So, is it a question of Push or Pull? Yes and no. Everything revolves around practical effect, with the focus on the parallel scheduling of computation; it is not simply black and white. Push versus Pull is best seen as a way of thinking about parallel scheduling.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reference&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;[1] Shaikhha, Amir and Dashti, Mohammad and Koch, Christoph, Push versus pull-based loop fusion in query engines, Journal of Functional Programming, Cambridge University Press, 2018&lt;/p&gt;

&lt;p&gt;[2] Leis, Viktor and Boncz, Peter and Kemper, Alfons and Neumann, Thomas, Morsel-driven parallelism: A NUMA-aware query evaluation framework for the many-core age, SIGMOD 2014&lt;/p&gt;

&lt;p&gt;[3] Thomas Neumann, Efficiently compiling efficient query plans for modern hardware, VLDB 2011&lt;/p&gt;

&lt;p&gt;[4] &lt;a href="https://github.com/duckdb/duckdb/pull/2393" rel="noopener noreferrer"&gt;Switch to Push-Based Execution Model by Mytherin · Pull Request #2393 · duckdb/duckdb (github.com)&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;[5] &lt;a href="https://presentations.clickhouse.com/meetup24/5.%20Clickhouse%20query%20execution%20pipeline%20changes/#1" rel="noopener noreferrer"&gt;ClickHouse &lt;br&gt;Query Execution Pipeline&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;MatrixOne Community&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Welcome to join the MatrixOne community.&lt;/p&gt;

&lt;p&gt;Website: &lt;a href="https://www.matrixorigin.io/" rel="noopener noreferrer"&gt;MatrixOrigin — Open Source Cloud Native Database MatrixOne | MatrixOrigin&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Source code: &lt;a href="https://github.com/matrixorigin/matrixone" rel="noopener noreferrer"&gt;matrixorigin/matrixone: Hyperconverged cloud-edge native database (github.com)&lt;br&gt;
&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Slack: MatrixOrigin-Slack&lt;/p&gt;

</description>
      <category>matrixone</category>
      <category>htap</category>
      <category>aws</category>
    </item>
    <item>
      <title>Monitoring AWS EKS and S3 with SkyWalking</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Mon, 13 Mar 2023 07:16:05 +0000</pubDate>
      <link>https://dev.to/aws/monitoring-aws-eks-and-s3-with-skywalking-30gj</link>
      <guid>https://dev.to/aws/monitoring-aws-eks-and-s3-with-skywalking-30gj</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3avj3vbhapqp308rblc.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv3avj3vbhapqp308rblc.png" alt=" " width="800" height="269"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;SkyWalking OAP’s existing &lt;a href="https://skywalking.apache.org/docs/main/next/en/setup/backend/opentelemetry-receiver/" rel="noopener noreferrer"&gt;OpenTelemetry receiver&lt;/a&gt; can receive metrics through the &lt;a href="https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md" rel="noopener noreferrer"&gt;OTLP protocol&lt;/a&gt; and use MAL to analyze the related metrics in real time. Starting from OAP 9.4.0, SkyWalking has added an &lt;a href="https://skywalking.apache.org/docs/main/next/en/setup/backend/aws-firehose-receiver/" rel="noopener noreferrer"&gt;AWS Firehose receiver&lt;/a&gt; to receive and analyze CloudWatch metrics data. This article takes EKS and S3 as examples to introduce how SkyWalking OAP receives and analyzes the metrics data of AWS services.&lt;/p&gt;

&lt;h1&gt;
  
  
  EKS
&lt;/h1&gt;

&lt;h2&gt;
  
  
  OpenTelemetry Collector
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://opentelemetry.io/" rel="noopener noreferrer"&gt;OpenTelemetry (OTel)&lt;/a&gt; is a series of tools, APIs, and SDKs that can generate, collect, and export telemetry data, such as metrics, logs, and traces. OTel Collector is mainly responsible for collecting, processing, and exporting. For telemetry data, Collector consists of the following main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Receiver&lt;/strong&gt;: Obtains telemetry data; different receivers support different data sources, such as Prometheus, Kafka, and OTLP.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Processor&lt;/strong&gt;: Processes data between the receiver and the exporter, for example adding or deleting attributes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Exporter&lt;/strong&gt;: Sends data to different backends, such as Kafka or SkyWalking OAP (via OTLP).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Service&lt;/strong&gt;: Configures which components are enabled; only configured components are enabled.&lt;/li&gt;
&lt;/ol&gt;

&lt;h3&gt;
  
  
  OpenTelemetry Protocol Specification (OTLP)
&lt;/h3&gt;

&lt;p&gt;&lt;a href="https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/protocol/otlp.md" rel="noopener noreferrer"&gt;OTLP&lt;/a&gt; mainly describes how to receive (pull) indicator data through gRPC and HTTP protocols. &lt;a href="https://skywalking.apache.org/docs/main/next/en/setup/backend/opentelemetry-receiver/" rel="noopener noreferrer"&gt;The OpenTelemetry receiver of&lt;/a&gt; SKyWalking OAP implements the OTLP/gRPC protocol, and the indicator data can be exported to OAP through the OTLP/gRPC exporter. Usually the data flow of a Collector is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftoyf4vxphwfu3brcku20.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftoyf4vxphwfu3brcku20.png" alt=" " width="800" height="292"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Monitor EKS with OTel
&lt;/h2&gt;

&lt;p&gt;EKS monitoring is realized through OTel. You only need to deploy the OpenTelemetry Collector in the EKS cluster as a &lt;code&gt;DaemonSet&lt;/code&gt;: use the &lt;a href="https://github.com/open-telemetry/opentelemetry-collector-contrib/blob/main/receiver/awscontainerinsightreceiver/README.md" rel="noopener noreferrer"&gt;AWS Container Insights Receiver&lt;/a&gt; as the receiver, and set the address of the otlp exporter to the address of OAP. Note that OAP uses the attribute &lt;code&gt;job_name: aws-cloud-eks-monitoring&lt;/code&gt; as the identifier of EKS metrics, so a processor must be configured in the Collector to add this attribute.&lt;/p&gt;

&lt;h3&gt;
  
  
  OTel Collector configuration demo
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;extensions:
  health_check:
receivers:
  awscontainerinsightreceiver:
processors:
# To enable OAP to correctly identify EKS metrics, add the job_name attribute
  resource/job-name:
    attributes:
    - key: job_name   
      value: aws-cloud-eks-monitoring
      action: insert     

# Specify OAP as exporters
exporters:
  otlp:
    endpoint: oap-service:11800 
    tls:
      insecure: true
  logging:
      loglevel: debug          
service:
  pipelines:
    metrics:
      receivers: [awscontainerinsightreceiver]
      processors: [resource/job-name]
      exporters: [otlp,logging]
  extensions: [health_check]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;By default, SkyWalking OAP aggregates network, disk, CPU, and other related metrics in three dimensions: Node, Pod, and Service. Only part of the content is shown here.&lt;/p&gt;

&lt;h4&gt;
  
  
  Pod dimensions
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpkhedwh8rky3k4znucp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwpkhedwh8rky3k4znucp.png" alt=" " width="800" height="264"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  Service dimensions
&lt;/h4&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqj5lrhj14q584dc5p4i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqj5lrhj14q584dc5p4i.png" alt=" " width="800" height="283"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  EKS monitoring complete configuration
&lt;/h4&gt;

&lt;ul&gt;
&lt;li&gt;Click here to view complete k8s resource configuration
&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apiVersion: v1
kind: ServiceAccount
metadata:
  name: aws-otel-sa
  namespace: aws-otel-eks

---
kind: ClusterRole
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: aoc-agent-role
rules:
  - apiGroups: [""]
    resources: ["pods", "nodes", "endpoints"]
    verbs: ["list", "watch"]
  - apiGroups: ["apps"]
    resources: ["replicasets"]
    verbs: ["list", "watch"]
  - apiGroups: ["batch"]
    resources: ["jobs"]
    verbs: ["list", "watch"]
  - apiGroups: [""]
    resources: ["nodes/proxy"]
    verbs: ["get"]
  - apiGroups: [""]
    resources: ["nodes/stats", "configmaps", "events"]
    verbs: ["create", "get"]
  - apiGroups: [""]
    resources: ["configmaps"]
    resourceNames: ["otel-container-insight-clusterleader"]
    verbs: ["get","update"]
  - apiGroups: ["coordination.k8s.io"]
    resources: ["leases"]
    verbs: ["create","get","update"]    

---
kind: ClusterRoleBinding
apiVersion: rbac.authorization.k8s.io/v1
metadata:
  name: aoc-agent-role-binding
subjects:
  - kind: ServiceAccount
    name: aws-otel-sa
    namespace: aws-otel-eks
roleRef:
  kind: ClusterRole
  name: aoc-agent-role
  apiGroup: rbac.authorization.k8s.io

---
apiVersion: v1
kind: ConfigMap
metadata:
  name: otel-agent-conf
  namespace: aws-otel-eks
  labels:
    app: opentelemetry
    component: otel-agent-conf
data:
  otel-agent-config: |
    extensions:
      health_check:

    receivers:
      awscontainerinsightreceiver:

    processors:
      resource/job-name:
        attributes:
        - key: job_name   
          value: aws-cloud-eks-monitoring
          action: insert     

    exporters:
      otlp:
        endpoint: oap-service:11800
        tls:
          insecure: true
      logging:
          loglevel: debug          

    service:
      pipelines:
        metrics:
          receivers: [awscontainerinsightreceiver]
          processors: [resource/job-name]
          exporters: [otlp,logging]
      extensions: [health_check]    

---

apiVersion: apps/v1
kind: DaemonSet
metadata:
  name: aws-otel-eks-ci
  namespace: aws-otel-eks
spec:
  selector:
    matchLabels:
      name: aws-otel-eks-ci
  template:
    metadata:
      labels:
        name: aws-otel-eks-ci
    spec:
      containers:
        - name: aws-otel-collector
          image: amazon/aws-otel-collector:v0.23.0
          env:
              # Specify region
            - name: AWS_REGION
              value: "ap-northeast-1"
            - name: K8S_NODE_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: HOST_IP
              valueFrom:
                fieldRef:
                  fieldPath: status.hostIP
            - name: HOST_NAME
              valueFrom:
                fieldRef:
                  fieldPath: spec.nodeName
            - name: K8S_NAMESPACE
              valueFrom:
                 fieldRef:
                   fieldPath: metadata.namespace
          imagePullPolicy: Always
          command:
            - "/awscollector"
            - "--config=/conf/otel-agent-config.yaml"
          volumeMounts:
            - name: rootfs
              mountPath: /rootfs
              readOnly: true
            - name: dockersock
              mountPath: /var/run/docker.sock
              readOnly: true
            - name: varlibdocker
              mountPath: /var/lib/docker
              readOnly: true
            - name: containerdsock
              mountPath: /run/containerd/containerd.sock
              readOnly: true
            - name: sys
              mountPath: /sys
              readOnly: true
            - name: devdisk
              mountPath: /dev/disk
              readOnly: true
            - name: otel-agent-config-vol
              mountPath: /conf
            - name: otel-output-vol  
              mountPath: /otel-output
          resources:
            limits:
              cpu:  200m
              memory: 200Mi
            requests:
              cpu: 200m
              memory: 200Mi
      volumes:
        - configMap:
            name: otel-agent-conf
            items:
              - key: otel-agent-config
                path: otel-agent-config.yaml
          name: otel-agent-config-vol
        - name: rootfs
          hostPath:
            path: /
        - name: dockersock
          hostPath:
            path: /var/run/docker.sock
        - name: varlibdocker
          hostPath:
            path: /var/lib/docker
        - name: containerdsock
          hostPath:
            path: /run/containerd/containerd.sock
        - name: sys
          hostPath:
            path: /sys
        - name: devdisk
          hostPath:
            path: /dev/disk/
        - name: otel-output-vol  
          hostPath:
            path: /otel-output
      serviceAccountName: aws-otel-sa
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h1&gt;
  
  
  S3
&lt;/h1&gt;

&lt;h2&gt;
  
  
  Amazon CloudWatch
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://docs.aws.amazon.com/AmazonCloudWatch/latest/monitoring/WhatIsCloudWatch.html" rel="noopener noreferrer"&gt;Amazon CloudWatch&lt;/a&gt; is a monitoring service provided by AWS. It is responsible for collecting indicator data of AWS services and resources. CloudWatch metrics stream is responsible for converting indicator data into stream processing data, and supports output in two formats: json and OTel v0.7.0.&lt;/p&gt;

&lt;h2&gt;
  
  
  Amazon Kinesis Data Firehose (Firehose)
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://aws.amazon.com/cn/kinesis/data-firehose/" rel="noopener noreferrer"&gt;Firehose&lt;/a&gt; is an extract, transform, load (ETL) service that reliably captures, transforms, and serves streaming data into data lakes, data stores (such as S3), and analytics services.&lt;/p&gt;

&lt;p&gt;To ensure that external services can correctly receive metrics data, AWS provides the &lt;a href="https://docs.aws.amazon.com/firehose/latest/dev/httpdeliveryrequestresponse.html" rel="noopener noreferrer"&gt;Kinesis Data Firehose HTTP Endpoint Delivery Request and Response Specifications (Firehose Specifications)&lt;/a&gt;. Firehose pushes JSON data via POST requests.&lt;/p&gt;

&lt;h4&gt;
  
  
  Json data example
&lt;/h4&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;{
  "requestId": "ed4acda5-034f-9f42-bba1-f29aea6d7d8f",
  "timestamp": 1578090901599,
  "records": [
    {
      "data": "aGVsbG8="
    },
    {
      "data": "aGVsbG8gd29ybGQ="
    }
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;requestId&lt;/strong&gt;: The request id, used for deduplication and debugging.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;timestamp&lt;/strong&gt;: The timestamp (in milliseconds) at which Firehose generated the request.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;records&lt;/strong&gt;: The actual delivered records.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;data&lt;/strong&gt;: The delivered data, encoded in base64; it can be in JSON or OTel v0.7.0 format, depending on the format of the CloudWatch data (described later). SkyWalking currently supports the OTel v0.7.0 format.&lt;/li&gt;
&lt;/ol&gt;

&lt;h4&gt;
  
  
  aws-firehose-receiver
&lt;/h4&gt;

&lt;p&gt;&lt;code&gt;aws-firehose-receiver&lt;/code&gt; provides an HTTP endpoint that implements the Firehose Specifications: &lt;code&gt;/aws/firehose/metrics&lt;/code&gt;. The figure below shows the data flow of monitoring DynamoDB, S3, and other services through CloudWatch, using Firehose to send metrics data to SkyWalking OAP.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvn8e40wctn788995bsu9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvn8e40wctn788995bsu9.png" alt=" " width="592" height="680"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Step-by-step setup of S3 monitoring
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;Enter the S3 console and create a filter for &lt;code&gt;Request metrics&lt;/code&gt;: &lt;code&gt;Amazon S3 &amp;gt;&amp;gt; Buckets &amp;gt;&amp;gt; (Your Bucket) &amp;gt;&amp;gt; Metrics &amp;gt;&amp;gt; metrics &amp;gt;&amp;gt; View additional charts &amp;gt;&amp;gt; Request metrics&lt;/code&gt;
&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7zvqodfsngutmoqx95l.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb7zvqodfsngutmoqx95l.png" alt=" " width="800" height="479"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="2"&gt;
&lt;li&gt;Enter the Amazon Kinesis console and create a delivery stream: select &lt;code&gt;Direct PUT&lt;/code&gt; as the &lt;code&gt;Source&lt;/code&gt; and &lt;code&gt;HTTP Endpoint&lt;/code&gt; as the &lt;code&gt;Destination&lt;/code&gt;, and set &lt;code&gt;HTTP endpoint URL&lt;/code&gt; to &lt;code&gt;https://your_domain/aws/firehose/metrics&lt;/code&gt;. Other configuration items:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;code&gt;Buffer hints&lt;/code&gt;: The size and interval of the buffer.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Access key&lt;/code&gt;: Must match the AccessKey configured in aws-firehose-receiver.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Retry duration&lt;/code&gt;: The retry period.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;Backup settings&lt;/code&gt;: Optionally back up the posted data to S3 at the same time.&lt;/li&gt;
&lt;/ul&gt;
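&lt;p&gt;The console settings above correspond to the &lt;code&gt;HttpEndpointDestinationConfiguration&lt;/code&gt; of the Firehose &lt;code&gt;CreateDeliveryStream&lt;/code&gt; API. Here is a sketch of assembling that configuration in Python (the helper name and default values are ours, not AWS defaults):&lt;/p&gt;

```python
def http_endpoint_destination(url, access_key, buffer_mb=5, buffer_seconds=300,
                              retry_seconds=300, backup_all=False):
    """Build the HTTP endpoint destination settings described above.

    Field names follow the Firehose CreateDeliveryStream API.
    """
    return {
        "EndpointConfiguration": {
            "Url": url,               # e.g. https://your_domain/aws/firehose/metrics
            "AccessKey": access_key,  # must match the AccessKey in aws-firehose-receiver
        },
        # Buffer hints: size and period of the buffer
        "BufferingHints": {"SizeInMBs": buffer_mb, "IntervalInSeconds": buffer_seconds},
        # Retry duration
        "RetryOptions": {"DurationInSeconds": retry_seconds},
        # Backup settings: optionally back up all posted data to S3
        "S3BackupMode": "AllData" if backup_all else "FailedDataOnly",
    }

cfg = http_endpoint_destination("https://your_domain/aws/firehose/metrics", "my-access-key")
```

&lt;p&gt;The resulting dict would be passed as &lt;code&gt;HttpEndpointDestinationConfiguration&lt;/code&gt; (together with &lt;code&gt;DeliveryStreamType="DirectPut"&lt;/code&gt; and an &lt;code&gt;S3Configuration&lt;/code&gt; for the backup bucket) to &lt;code&gt;boto3.client("firehose").create_delivery_stream&lt;/code&gt;.&lt;/p&gt;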

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcx2m4gf9f8hjfhoojuyp.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcx2m4gf9f8hjfhoojuyp.png" alt=" " width="800" height="946"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Enter &lt;code&gt;Streams&lt;/code&gt; in the CloudWatch console and click Create CloudWatch Stream. Under &lt;code&gt;Select your Kinesis Data Firehose stream&lt;/code&gt;, choose the delivery stream created in the second step. Note that &lt;code&gt;Change output format&lt;/code&gt; needs to be set to &lt;code&gt;OpenTelemetry v0.7.0&lt;/code&gt;.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgi05g0djifunu8p1evoy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fgi05g0djifunu8p1evoy.png" alt=" " width="800" height="774"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;At this point, the S3 monitoring configuration is complete. The S3 metrics collected by SkyWalking by default are shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkl7vfr8kctbl3m4vsgb.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzkl7vfr8kctbl3m4vsgb.png" alt=" " width="800" height="314"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h1&gt;
  
  
  Other services
&lt;/h1&gt;

&lt;p&gt;Currently SkyWalking officially supports EKS, S3, and DynamoDB monitoring. Users can also refer to &lt;a href="https://skywalking.apache.org/docs/main/next/en/setup/backend/opentelemetry-receiver/" rel="noopener noreferrer"&gt;the OpenTelemetry receiver&lt;/a&gt; to configure OTel rules that collect and analyze CloudWatch metrics of other AWS services, and display them through &lt;a href="https://skywalking.apache.org/docs/main/next/en/ui/readme/" rel="noopener noreferrer"&gt;a custom dashboard&lt;/a&gt;.&lt;/p&gt;

&lt;h1&gt;
  
  
  Material
&lt;/h1&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/AmazonS3/latest/userguide/cloudwatch-monitoring.html" rel="noopener noreferrer"&gt;Monitoring S3 metrics with Amazon CloudWatch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/amazondynamodb/latest/developerguide/monitoring-cloudwatch.html" rel="noopener noreferrer"&gt;Monitoring DynamoDB metrics with Amazon CloudWatch&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://skywalking.apache.org/docs/main/next/en/setup/backend/aws-firehose-receiver/" rel="noopener noreferrer"&gt;Supported metrics in AWS Firehose receiver of OAP&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;a href="https://skywalking.apache.org/docs/main/next/en/setup/backend/configuration-vocabulary/" rel="noopener noreferrer"&gt;Configuration Vocabulary | Apache SkyWalking&lt;/a&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Original link: &lt;a href="https://skywalking.apache.org/blog/2023-03-12-skywalking-aws-s3-eks/" rel="noopener noreferrer"&gt;https://skywalking.apache.org/blog/2023-03-12-skywalking-aws-s3-eks/&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Read the Chinese version of this article: &lt;a href="https://dev.amazoncloud.cn/column/article/640e8ae886a0725236ae9695?sc_channel=devto" rel="noopener noreferrer"&gt;https://dev.amazoncloud.cn/column/article/640e8ae886a0725236ae9695?sc_channel=devto&lt;/a&gt;&lt;/p&gt;

</description>
      <category>aws</category>
    </item>
    <item>
      <title>Time-Slicing GPUs with Karpenter</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Wed, 14 Dec 2022 08:17:22 +0000</pubDate>
      <link>https://dev.to/aws/time-slicing-gpus-with-karpenter-43nn</link>
      <guid>https://dev.to/aws/time-slicing-gpus-with-karpenter-43nn</guid>
      <description>&lt;h1&gt;
  
  
  Time-Slicing GPUs with Karpenter
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Author: Ran Tao, Cloud Architect @&lt;a href="//jina.ai"&gt;Jina AI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;This article is originally published on &lt;a href="https://jina.ai/news/paradigm-shift-towards-multimodal-ai/" rel="noopener noreferrer"&gt;Jina AI News&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Today, businesses and developers are keen to use the cloud for deep learning. Especially with GPU cloud instances, you pay as you go. It is much more cost-efficient compared to having an expensive bare-metal machine in the office.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;But let's switch roles now.&lt;/strong&gt; Say you are the GPU cloud provider, and you provide the GPU environment for hosting other users' applications. The problem now becomes, &lt;strong&gt;how can you, as the platform provider, lower GPU costs to maximize profit?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is not about finding the cheapest GPU vendors. In fact, it is &lt;em&gt;the&lt;/em&gt; question we were facing at Jina AI when designing our GPU cloud platform.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdocs.jina.ai%2F_images%2Fjcloud-banner.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdocs.jina.ai%2F_images%2Fjcloud-banner.png" width="" height=""&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="//cloud.jina.ai"&gt;Jina AI Cloud Hosting&lt;/a&gt; After building a Jina project, the next step is to deploy and host it on the cloud. Jina AI Cloud is Jina’s reliable, scalable and production-ready cloud-hosting solution that manages your project lifecycle without surprises or hidden development costs. &lt;/p&gt;

&lt;p&gt;The answer is &lt;strong&gt;time-slicing.&lt;/strong&gt;&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;💡 Time-slicing allows oversubscription of GPUs. Under the hood, CUDA time-slicing is used to allow workloads that land on oversubscribed GPUs to interleave with one another. Each workload has access to the GPU memory and runs in the same fault domain as all of the others.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;In this article, we will use Karpenter, an elastic node-scaling tool for Kubernetes, together with NVIDIA's k8s device plugin to achieve time-slicing on GPUs. A GPU cloud with time-slicing allows users to share GPUs between pods, hence saving costs.&lt;/p&gt;

&lt;p&gt;&lt;a href="/Users/sloan/Library/Application%20Support/typora-user-images/image-20221214155559451.png" class="article-body-image-wrapper"&gt;&lt;img src="/Users/sloan/Library/Application%20Support/typora-user-images/image-20221214155559451.png" alt="image-20221214155559451"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;[Karpenter](https://karpenter.sh)&lt;/center&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fopengraph.githubassets.com%2F786d7d790a8e2a453535b2ea83a1440abc4e4eeb91cd1573b1dab2a4520284b3%2FNVIDIA%2Fk8s-device-plugin" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fopengraph.githubassets.com%2F786d7d790a8e2a453535b2ea83a1440abc4e4eeb91cd1573b1dab2a4520284b3%2FNVIDIA%2Fk8s-device-plugin" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;[K8s-device-plugin](https://github.com/NVIDIA/k8s-device-plugin)&lt;/center&gt;

&lt;p&gt;Karpenter itself provides an auto-scaling feature for nodes, which means that you will have a GPU instance only when you need it, and nodes can be scheduled based on the instance types you configure. It saves you money and schedules nodes more effectively.&lt;/p&gt;

&lt;p&gt;The purpose of utilizing GPUs with Karpenter is not only saving cost; more importantly, it also provides a flexible way to schedule GPU resources for our applications within the Kubernetes cluster. You may own tens of applications that need GPUs in different time slots, and scheduling them cost-effectively matters a great deal in the cloud.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Architecture&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzwdpgz6v84ye0459enuh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fzwdpgz6v84ye0459enuh.png" width="800" height="541"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Infrastructure diagram&lt;/center&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz30o861r8zotwb0tbymn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fz30o861r8zotwb0tbymn.png" width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Component diagram&lt;/center&gt;

&lt;p&gt;It’s pretty straightforward: the application will choose a karpenter provisioner with a selector. The karpenter provisioner will create nodes based on the launch template in that provisioner.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Deployment&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Building the architecture is simple; the problem we are left with is how to deploy it. There are some particulars we need to think about:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  How do we deploy the NVIDIA k8s plugin only to the nodes with GPUs?&lt;/li&gt;
&lt;li&gt;  How do we configure the shared GPU nodes to use time-slicing without affecting the others?&lt;/li&gt;
&lt;li&gt;  How do we automatically update the node AMI in the launch template so the nodes can use the latest image?&lt;/li&gt;
&lt;li&gt;  How do we set up the Karpenter provisioners?&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Let’s do it one by one then.&lt;/p&gt;

&lt;p&gt;First, install Karpenter and set up the provisioners with Terraform. You can also install Karpenter in EKS manually by following the official documentation. If you already have an EKS cluster with Karpenter, you can skip this step.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fopengraph.githubassets.com%2F675a7982d2740898ca324bd3192fa48af572f5c9ad958e575c2b6e5b2b751a6b%2Ftarrantro%2Fterraform" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fopengraph.githubassets.com%2F675a7982d2740898ca324bd3192fa48af572f5c9ad958e575c2b6e5b2b751a6b%2Ftarrantro%2Fterraform" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;[tarrantro](https://github.com/tarrantro/terraform)&lt;/center&gt;

&lt;h3&gt;
  
  
  Set provisioner
&lt;/h3&gt;

&lt;p&gt;The provisioners are set to use correlated launch templates to provision GPU nodes with labels and taints.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;resource "kubectl_manifest" "karpenter_provisioner_gpu_shared" {
  yaml_body = &amp;lt;&amp;lt;-YAML
  apiVersion: karpenter.sh/v1alpha5
  kind: Provisioner
  metadata:
    name: gpu-shared
  spec:
    ttlSecondsAfterEmpty: 300
    labels:
      jina.ai/node-type: gpu-shared
      jina.ai/gpu-type: nvidia
      nvidia.com/device-plugin.config: shared_gpu
    requirements:
      - key: node.kubernetes.io/instance-type
        operator: In
        values: ["g4dn.xlarge", "g4dn.2xlarge", "g4dn.4xlarge", "g4dn.12xlarge"]
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
    taints:
      - key: nvidia.com/gpu-shared
        effect: "NoSchedule"
    limits:
      resources:
        cpu: 1000
    provider:
      launchTemplate: "karpenter-gpu-shared-${local.cluster_name}"
      subnetSelector:
        karpenter.sh/discovery: ${local.cluster_name}
      tags:
        karpenter.sh/discovery: ${local.cluster_name}
  YAML

  depends_on = [
    helm_release.karpenter
  ]
}

resource "kubectl_manifest" "karpenter_provisioner_gpu" {
  yaml_body = &amp;lt;&amp;lt;-YAML
  apiVersion: karpenter.sh/v1alpha5
  kind: Provisioner
  metadata:
    name: gpu
  spec:
    ttlSecondsAfterEmpty: 300
    labels:
      jina.ai/node-type: gpu
      jina.ai/gpu-type: nvidia
    requirements:
      - key: node.kubernetes.io/instance-type
        operator: In
        values: ["g4dn.xlarge", "g4dn.2xlarge", "g4dn.4xlarge", "g4dn.12xlarge"]
      - key: karpenter.sh/capacity-type
        operator: In
        values: ["spot", "on-demand"]
      - key: kubernetes.io/arch
        operator: In
        values: ["amd64"]
    taints:
      - key: nvidia.com/gpu
        effect: "NoSchedule"
    limits:
      resources:
        cpu: 1000
    provider:
      launchTemplate: "karpenter-gpu-${local.cluster_name}"
      subnetSelector:
        karpenter.sh/discovery: ${local.cluster_name}
      tags:
        karpenter.sh/discovery: ${local.cluster_name}
  YAML

  depends_on = [
    helm_release.karpenter
  ]
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;center&gt;Provisioner&lt;/center&gt;

&lt;p&gt;Launch template (only GPU): &lt;a href="https://gist.github.com/tarrantro/6c6bfa1e114e4ca4762a5facaeb0c355" rel="noopener noreferrer"&gt;gpu_launchtemplate.hcl&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Add time-slicing config
&lt;/h3&gt;

&lt;p&gt;Secondly, we need to deploy the NVIDIA k8s plugin with a time-slicing config alongside the default config, and set up a node selector so the daemonset will only run on the GPU instances.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;config:
  # ConfigMap name if pulling from an external ConfigMap
  name: ""
  # Set of named configs to build an integrated ConfigMap from
  map: 
    default: |-
      version: v1
      flags:
        migStrategy: "none"
        failOnInitError: true
        nvidiaDriverRoot: "/"
        plugin:
          passDeviceSpecs: false
          deviceListStrategy: envvar
          deviceIDStrategy: uuid
    shared_gpu: |-
      version: v1
      flags:
        migStrategy: "none"
        failOnInitError: true
        nvidiaDriverRoot: "/"
        plugin:
          passDeviceSpecs: false
          deviceListStrategy: envvar
          deviceIDStrategy: uuid
      sharing:
        timeSlicing:
          renameByDefault: false
          resources:
          - name: nvidia.com/gpu
            replicas: 10
nodeSelector: 
  jina.ai/gpu-type: nvidia
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;center&gt;nvdp.yaml&lt;/center&gt;
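&lt;p&gt;With &lt;code&gt;replicas: 10&lt;/code&gt;, the device plugin advertises each physical GPU as ten schedulable &lt;code&gt;nvidia.com/gpu&lt;/code&gt; resources. A quick sketch of the arithmetic (physical GPU counts are for the g4dn instance types allowed by the provisioners above):&lt;/p&gt;

```python
# Physical GPUs per instance type used in the gpu-shared provisioner.
PHYSICAL_GPUS = {"g4dn.xlarge": 1, "g4dn.2xlarge": 1, "g4dn.4xlarge": 1, "g4dn.12xlarge": 4}

def advertised_gpu_capacity(instance_type, replicas=10):
    """nvidia.com/gpu capacity a node reports once time-slicing is enabled."""
    return PHYSICAL_GPUS[instance_type] * replicas

print(advertised_gpu_capacity("g4dn.xlarge"))    # 10: ten pods can each request one GPU
print(advertised_gpu_capacity("g4dn.12xlarge"))  # 40
```

&lt;p&gt;Keep in mind the slices share memory and a fault domain, so oversubscription only pays off for workloads that do not saturate the GPU.&lt;/p&gt;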

&lt;p&gt;Run the below command to install NVIDIA’s k8s plugin:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;helm repo add nvdp https://nvidia.github.io/k8s-device-plugin
helm repo update
helm upgrade -i nvdp nvdp/nvidia-device-plugin \
  --namespace nvidia-device-plugin \
  --create-namespace -f nvdp.yaml
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Deploy user application
&lt;/h3&gt;

&lt;p&gt;Third, deploy the user application with &lt;strong&gt;nodeSelector&lt;/strong&gt; and &lt;strong&gt;toleration.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind: Deployment
apiVersion: apps/v1
metadata:
  name: test-gpu
  labels:
    app: gpu
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu
  template:
    metadata:
      labels:
        app: gpu
    spec:
      nodeSelector:
        jina.ai/node-type: gpu
        karpenter.sh/provisioner-name: gpu
      tolerations:
      - key: nvidia.com/gpu
        operator: Exists
        effect: NoSchedule
      containers:
      - name: gpu-container
        image: tensorflow/tensorflow:latest-gpu
        imagePullPolicy: Always
        command: ["python"]
        args: ["-u", "-c", "import tensorflow"]
        resources:
          limits:
            nvidia.com/gpu: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;center&gt;gpu.yml&lt;/center&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kind: Deployment
apiVersion: apps/v1
metadata:
  name: test-gpu-shared
  labels:
    app: gpu-shared
spec:
  replicas: 1
  selector:
    matchLabels:
      app: gpu-shared
  template:
    metadata:
      labels:
        app: gpu-shared
    spec:
      nodeSelector:
        jina.ai/node-type: gpu-shared
        karpenter.sh/provisioner-name: gpu-shared
      tolerations:
      - key: nvidia.com/gpu-shared
        operator: Exists
        effect: NoSchedule
      containers:
      - name: gpu-container
        image: tensorflow/tensorflow:latest-gpu
        imagePullPolicy: Always
        command: ["python"]
        args: ["-u", "-c", "import tensorflow"]
        resources:
          limits:
            nvidia.com/gpu: 1
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;center&gt;gpu-shared.yml&lt;/center&gt;
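&lt;p&gt;The two manifests differ only in their node-selector/toleration pair. When generating such deployments programmatically, the variant can be derived from a single flag; a sketch (the helper is hypothetical, while the label and taint keys come from the provisioners above):&lt;/p&gt;

```python
def gpu_scheduling(shared=False):
    """nodeSelector and toleration for the exclusive vs. time-sliced GPU pool."""
    pool = "gpu-shared" if shared else "gpu"
    return {
        "nodeSelector": {
            "jina.ai/node-type": pool,
            "karpenter.sh/provisioner-name": pool,
        },
        "tolerations": [{
            "key": "nvidia.com/" + pool,  # matches the provisioner's taint
            "operator": "Exists",
            "effect": "NoSchedule",
        }],
    }

print(gpu_scheduling(shared=True)["tolerations"][0]["key"])  # nvidia.com/gpu-shared
```

&lt;p&gt;Merging the returned dict into a pod spec steers the workload to the matching Karpenter provisioner and past its taint.&lt;/p&gt;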

&lt;h3&gt;
  
  
  Validate the results
&lt;/h3&gt;

&lt;p&gt;Now, deploy both YAML files. You will see two nodes provisioned in the AWS console, or via &lt;code&gt;kubectl get nodes --show-labels&lt;/code&gt;. After the &lt;code&gt;nvidia-k8s-plugin&lt;/code&gt; is running on each node, you can test your applications.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu87y2d1vi4vq9y4vqt4f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fu87y2d1vi4vq9y4vqt4f.png" width="800" height="136"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;The result shown in the AWS EC2 console&lt;/center&gt;

&lt;p&gt;If you like this article or want to learn more about the architecture behind Jina AI Cloud, make sure to &lt;a href="https://twitter.com/jinaAI_/" rel="noopener noreferrer"&gt;follow us on social channels&lt;/a&gt; and &lt;a href="https://jina.ai/news/" rel="noopener noreferrer"&gt;subscribe to our blog.&lt;/a&gt;&lt;/p&gt;

</description>
      <category>git</category>
    </item>
    <item>
      <title>How to run Apache SkyWalking on AWS EKS and RDS/Aurora</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Tue, 13 Dec 2022 11:59:43 +0000</pubDate>
      <link>https://dev.to/aws/how-to-run-apache-skywalking-on-aws-eks-and-rdsaurora-4a5d</link>
      <guid>https://dev.to/aws/how-to-run-apache-skywalking-on-aws-eks-and-rdsaurora-4a5d</guid>
      <description>&lt;p&gt;Original link:&lt;a href="https://skywalking.apache.org/blog/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/" rel="noopener noreferrer"&gt;https://skywalking.apache.org/blog/2022-12-13-how-to-run-apache-skywalking-on-aws-eks-rds/&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Introduction&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Apache SkyWalking is an open source APM tool for monitoring and troubleshooting distributed systems, especially designed for microservices, cloud native and container-based (Docker, Kubernetes, Mesos) architectures. It provides distributed tracing, service mesh observability, metric aggregation and visualization, and alarm.&lt;/p&gt;

&lt;p&gt;In this article, I will introduce how to quickly set up Apache SkyWalking on AWS EKS and RDS/Aurora, as well as a couple of sample services, monitoring services to observe SkyWalking itself.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Prerequisites&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;AWS account&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;a href="https://docs.aws.amazon.com/cli/latest/userguide/install-cliv2.html" rel="noopener noreferrer"&gt;AWS CLI&lt;/a&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://www.terraform.io/downloads.html" rel="noopener noreferrer"&gt;Terraform&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;a href="https://kubernetes.io/docs/tasks/tools/#kubectl" rel="noopener noreferrer"&gt;kubectl&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;We can use the AWS web console or CLI to create all the resources needed in this tutorial, but that can be tedious and hard to debug when something goes wrong. So in this article I will use Terraform to create all AWS resources and deploy SkyWalking, the sample services, and the load generator services (Locust).&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Architecture&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;The demo architecture is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh3r69d0lqj8nobps7i8y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fh3r69d0lqj8nobps7i8y.png" alt=" " width="800" height="216"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;As shown in the architecture diagram, we need to create the following AWS resources:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;EKS cluster&lt;/li&gt;
&lt;li&gt;RDS instance or Aurora cluster&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Sounds simple, but there are a lot of things behind the scenes, such as VPCs, subnets, security groups, etc. You have to configure them correctly to make sure the EKS cluster can connect to the RDS instance/Aurora cluster; otherwise SkyWalking won’t work. Luckily, Terraform can help us create and destroy all these resources automatically.&lt;/p&gt;

&lt;p&gt;I have created a Terraform module to create all AWS resources needed in this tutorial, you can find it in the &lt;a href="https://github.com/kezhenxu94/oap-load-test/tree/main/aws" rel="noopener noreferrer"&gt;GitHub repository&lt;/a&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Create AWS resources&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;First, we need to clone the GitHub repository and cd into the folder:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;git clone https://github.com/kezhenxu94/oap-load-test.git
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then, we need to create a file named &lt;code&gt;terraform.tfvars&lt;/code&gt; to specify the AWS region and other variables:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;cat &amp;gt; terraform.tfvars &amp;lt;&amp;lt;EOF
aws_access_key = ""
aws_secret_key = ""
cluster_name   = "skywalking-on-aws"
region         = "ap-east-1"
db_type        = "rds-postgresql"
EOF
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If you have already configured the AWS CLI, you can skip the &lt;code&gt;aws_access_key&lt;/code&gt; and &lt;code&gt;aws_secret_key&lt;/code&gt; variables. To install SkyWalking with RDS PostgreSQL, set &lt;code&gt;db_type&lt;/code&gt; to &lt;code&gt;rds-postgresql&lt;/code&gt;; to install SkyWalking with Aurora PostgreSQL, set &lt;code&gt;db_type&lt;/code&gt; to &lt;code&gt;aurora-postgresql&lt;/code&gt;.&lt;/p&gt;
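&lt;p&gt;If you script this setup, the variables file can be generated and validated before invoking Terraform. A small sketch (the helper is ours; the accepted &lt;code&gt;db_type&lt;/code&gt; values are the two mentioned above):&lt;/p&gt;

```python
ALLOWED_DB_TYPES = {"rds-postgresql", "aurora-postgresql"}

def render_tfvars(cluster_name, region, db_type):
    """Render terraform.tfvars content, rejecting unsupported database types early."""
    if db_type not in ALLOWED_DB_TYPES:
        raise ValueError("unsupported db_type: " + db_type)
    return (
        'cluster_name   = "{}"\n'
        'region         = "{}"\n'
        'db_type        = "{}"\n'
    ).format(cluster_name, region, db_type)

print(render_tfvars("skywalking-on-aws", "ap-east-1", "rds-postgresql"))
```

&lt;p&gt;Catching a typo in &lt;code&gt;db_type&lt;/code&gt; here is cheaper than waiting for &lt;code&gt;terraform apply&lt;/code&gt; to fail mid-run.&lt;/p&gt;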

&lt;p&gt;There are a lot of other variables you can configure, such as tags, sample service count, replicas, etc.; you can find them in &lt;a href="https://github.com/kezhenxu94/oap-load-test/blob/main/aws/variables.tf" rel="noopener noreferrer"&gt;variables.tf&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Then, we can run the following commands to initialize the Terraform module and download the required providers, then create all AWS resources:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform init
terraform apply -var-file=terraform.tfvars
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Type &lt;code&gt;yes&lt;/code&gt; to confirm the creation of all AWS resources, or add the &lt;code&gt;-auto-approve&lt;/code&gt; flag to the &lt;code&gt;terraform apply&lt;/code&gt; to skip the confirmation:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform apply -var-file=terraform.tfvars -auto-approve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Now all you need to do is wait for the creation of the AWS resources to complete; it may take a few minutes. You can check the progress in the AWS web console, as well as the deployment progress of the services inside the EKS cluster.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Generate traffic&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;Besides creating necessary AWS resources, the Terraform module also deploys SkyWalking, sample services, and Locust load generator services to the EKS cluster.&lt;/p&gt;

&lt;p&gt;You can access the Locust web UI to generate traffic to the sample services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;open http://$(kubectl get svc -n locust -l app=locust-master -o jsonpath='{.items[0].status.loadBalancer.ingress[0].hostname}'):8089
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The command opens the browser to the Locust web UI, where you can configure the number of users and the hatch rate to generate traffic.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Observe SkyWalking&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;You can access the SkyWalking web UI to observe the sample services.&lt;/p&gt;

&lt;p&gt;First, forward the SkyWalking UI port to your local machine:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl -n istio-system port-forward $(kubectl -n istio-system get pod -l app=skywalking -l component=ui -o name) 8080:8080
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;And then open the browser to &lt;a href="http://localhost:8080" rel="noopener noreferrer"&gt;http://localhost:8080&lt;/a&gt; to access the SkyWalking web UI.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Observe RDS/Aurora&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;You can also use the RDS/Aurora web console to observe the performance of the RDS instance or Aurora cluster.&lt;/p&gt;

&lt;h3&gt;
  
  
  &lt;strong&gt;Test Results&lt;/strong&gt;
&lt;/h3&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Test 1: SkyWalking with EKS and RDS PostgreSQL&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Service Traffic&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66fqh7r9b7658z3zvbb1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F66fqh7r9b7658z3zvbb1.png" alt=" " width="800" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy3x3rsbt4dqqur32848r.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fy3x3rsbt4dqqur32848r.png" alt=" " width="800" height="293"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RDS Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dxh1el3ppv5qlnq4pgy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3dxh1el3ppv5qlnq4pgy.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2eirnfhj5qdqfo80yh1.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2eirnfhj5qdqfo80yh1.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjfaxwqv2a49xydp8ed6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmjfaxwqv2a49xydp8ed6.png" alt=" " width="800" height="450"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SkyWalking Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxf6dqraph5vmkzshod8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhxf6dqraph5vmkzshod8.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpcujelpqan1ok6yc0gzy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpcujelpqan1ok6yc0gzy.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqr9o9p6tr3xixs3u1dj.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvqr9o9p6tr3xixs3u1dj.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjyxcegzmt58e100nilk7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjyxcegzmt58e100nilk7.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1o60092n8u54q2ecquoa.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1o60092n8u54q2ecquoa.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h4&gt;
  
  
  &lt;strong&gt;Test 2: SkyWalking with EKS and Aurora PostgreSQL&lt;/strong&gt;
&lt;/h4&gt;

&lt;p&gt;&lt;strong&gt;Service Traffic&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fek4lmsvf78f0wdkbndii.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fek4lmsvf78f0wdkbndii.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqywt6w2ojmfnik93x7p.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjqywt6w2ojmfnik93x7p.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;RDS Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7i2yqy4zzysem3t228rd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F7i2yqy4zzysem3t228rd.png" alt=" " width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ax0pvlyaqmhh4aj5tbx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F3ax0pvlyaqmhh4aj5tbx.png" alt=" " width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcts0jfvedohk8mvbbbny.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcts0jfvedohk8mvbbbny.png" alt=" " width="800" height="522"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;SkyWalking Performance&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpqbvccglqeo9bqbs0314.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fpqbvccglqeo9bqbs0314.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fax6g0f20ddi1fkk2sm1b.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fax6g0f20ddi1fkk2sm1b.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwa7uruucpwulhptrjy8y.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fwa7uruucpwulhptrjy8y.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9yfq19vor912bwpgwcgt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9yfq19vor912bwpgwcgt.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7lr7f8w8dpo82h8js0d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe7lr7f8w8dpo82h8js0d.png" alt=" " width="800" height="435"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Clean up&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When you are done with the demo, you can run the following command to destroy all AWS resources:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;terraform destroy -var-file=terraform.tfvars -auto-approve
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



</description>
      <category>skywalking</category>
      <category>observability</category>
      <category>demo</category>
      <category>aws</category>
    </item>
    <item>
      <title>The Paradigm Shift Towards Multimodal AI Jina AI MLOps for Multimodal AI Neural Search and Creative AI</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Mon, 05 Dec 2022 11:28:46 +0000</pubDate>
      <link>https://dev.to/aws/the-paradigm-shift-towards-multimodal-ai-jina-ai-mlops-for-multimodal-ai-neural-search-and-creative-ai-4l8l</link>
      <guid>https://dev.to/aws/the-paradigm-shift-towards-multimodal-ai-jina-ai-mlops-for-multimodal-ai-neural-search-and-creative-ai-4l8l</guid>
      <description>&lt;h1&gt;
  
  
  The Paradigm Shift Towards Multimodal AI
&lt;/h1&gt;

&lt;blockquote&gt;
&lt;p&gt;Author: Jina AI Founder &amp;amp; CEO Han Xiao&lt;/p&gt;

&lt;p&gt;This article is originally published on &lt;a href="https://jina.ai/news/paradigm-shift-towards-multimodal-ai/" rel="noopener noreferrer"&gt;Jina AI News&lt;/a&gt;.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;&lt;strong&gt;Excerpt:&lt;/strong&gt; We are on the cusp of a new era in AI, one in which multimodal AI will be the norm. At &lt;a href="//jina.ai"&gt;Jina AI&lt;/a&gt;, our &lt;a href="//get.jina.ai"&gt;MLOps platform&lt;/a&gt; helps businesses and developers win while they're right at the starting line of this paradigm shift, and build the applications of the future today.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3dkb54v4ffp7vcmk3zn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fp3dkb54v4ffp7vcmk3zn.png" alt="img" width="800" height="400"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Every time I introduce Jina AI and explain what we do, I switch up my narrative depending on who I'm talking to:&lt;/p&gt;

&lt;p&gt;1️⃣ Jina AI is the MLOps platform for &lt;strong&gt;cross-modal and multimodal&lt;/strong&gt; data.&lt;/p&gt;

&lt;p&gt;2️⃣ Jina AI is the MLOps platform for &lt;strong&gt;neural search and creative AI&lt;/strong&gt; applications.&lt;/p&gt;

&lt;p&gt;The first narrative is data-driven and academic-oriented, aimed at AI researchers. The second is more application-driven and intuitive for practitioners and industry partners. Whatever the narrative, four terms are new to most people:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Cross-modal&lt;/li&gt;
&lt;li&gt;  Multimodal&lt;/li&gt;
&lt;li&gt;  Neural search&lt;/li&gt;
&lt;li&gt;  Creative AI&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Some people have heard of unstructured data, but what's &lt;em&gt;multimodal&lt;/em&gt; data? Some have heard of semantic search, but what on Earth is &lt;em&gt;neural&lt;/em&gt; search?&lt;/p&gt;

&lt;p&gt;Most confusingly of all, why lump these four terms together, and why is Jina AI working on one MLOps platform to cover them all?&lt;/p&gt;

&lt;p&gt;This article answers those questions. But I'm impatient. Let's fast-forward to the conclusion: &lt;strong&gt;The AI industry has shifted away from single-modal AI and has entered the era of multimodal AI&lt;/strong&gt;, as illustrated below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing.svg" width="950" height="409"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Jina AI spectrum on future AI applications&lt;/center&gt;

&lt;p&gt;At Jina AI, our spectrum encompasses cross-modal, multimodal, neural search, and creative AI, covering a significant portion of future AI applications. Our MLOps platform gives businesses and developers the edge right at the starting line of this paradigm shift, helping them build the applications of the future today.&lt;/p&gt;

&lt;p&gt;In the next sections, we'll review the development of single-modal AI and see how this paradigm shift is happening right beneath our noses.&lt;/p&gt;

&lt;h2&gt;
  
  
  Single-Modal AI
&lt;/h2&gt;

&lt;p&gt;In computer science, "modality" roughly means "data type". When we talk about single-modal AI, we're talking about applying AI to one specific type of data. Most early machine learning works fall into this category. Even today, when you open any machine learning literature, single-modal AI is still the majority of the content.&lt;/p&gt;

&lt;h3&gt;
  
  
  Natural Language Processing
&lt;/h3&gt;

&lt;p&gt;We'll start our look back with natural language processing (NLP). Back in 2010, I published a paper about an improved Gibbs sampling algorithm for the Latent Dirichlet Allocation (LDA) model:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0hh6ff3mekmpsrxg29i.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fc0hh6ff3mekmpsrxg29i.png" width="800" height="623"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Efficient Collapsed Gibbs Sampling For Latent Dirichlet Allocation, 2010&lt;/center&gt;

&lt;p&gt;Some old machine learning researchers may still remember LDA: a parametric Bayesian model for modeling text corpora. It "clusters" words into topics and represents each document as a combination of topics. For this reason, some people called it a "topic model".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fipxauges9rt29tp8okaz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fipxauges9rt29tp8okaz.png" width="800" height="555"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;From 2008 to 2012, the topic model was one of the most effective and popular models in the NLP community – it was the BERT/Transformer of its day. Every year at top-tier ML/NLP conferences, many papers would extend or improve the original model. But looking back on it today, it was a pretty "shallow learning" model with a very ad-hoc language modeling approach. It assumed words were generated from a mixture of multinomial distributions. This makes sense for certain specific tasks but isn't general enough for other tasks, domains, or modalities.&lt;/p&gt;
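&lt;p&gt;As a rough illustration of what a topic model does, here is a minimal LDA sketch using scikit-learn (which uses batch variational inference rather than the collapsed Gibbs sampling discussed above); the toy corpus and topic count are invented for the example:&lt;/p&gt;

```python
# Illustrative LDA sketch: cluster words into topics and represent
# each document as a mixture of those topics.
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation

docs = [
    "graph database query engine",
    "graph edges vertices storage",
    "neural network training data",
    "deep neural network model",
]
counts = CountVectorizer().fit_transform(docs)   # bag-of-words matrix
lda = LatentDirichletAllocation(n_components=2, random_state=0)
doc_topics = lda.fit_transform(counts)           # one topic mixture per document
print(doc_topics.shape)
```

Each row of `doc_topics` is a normalized distribution over the two topics, which is exactly the "document as a combination of topics" representation described above.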

&lt;p&gt;Back in 2010-2020, ad-hoc approaches like this were the norm in NLP. Researchers and engineers developed specialist algorithms, each of which was good at solving one task, and one task only:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--1-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--1-.svg" width="780" height="267"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Top 20 most common NLP tasks&lt;/center&gt;

&lt;h3&gt;
  
  
  Computer Vision
&lt;/h3&gt;

&lt;p&gt;Compared to NLP, I came to the field of computer vision (CV) pretty late. While at Zalando in 2017, I published a paper on the &lt;a href="https://github.com/zalandoresearch/fashion-mnist" rel="noopener noreferrer"&gt;Fashion-MNIST dataset&lt;/a&gt;. This dataset is a drop-in replacement for &lt;a href="http://yann.lecun.com/exdb/mnist/" rel="noopener noreferrer"&gt;Yann LeCun's original MNIST dataset&lt;/a&gt; from 1990 (a set of simple handwritten digits for benchmarking computer vision algorithms). The original MNIST dataset was too trivial for many algorithms: shallow learning algorithms such as logistic regression, decision trees, and support vector machines could easily hit 90% accuracy, leaving little room for deep learning algorithms to shine.&lt;/p&gt;
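&lt;p&gt;To see just how trivial, here is a hedged sketch using scikit-learn's bundled 8x8 digits dataset as an offline stand-in for MNIST; even plain logistic regression scores well above 90%:&lt;/p&gt;

```python
# A "shallow learning" baseline on simple digit images.
from sklearn.datasets import load_digits
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_digits(return_X_y=True)              # 1797 images, 8x8 pixels each
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0)
clf = LogisticRegression(max_iter=2000).fit(X_train, y_train)
acc = clf.score(X_test, y_test)
print(f"shallow baseline accuracy: {acc:.3f}")
```

When a linear model already saturates a benchmark like this, the benchmark can no longer separate deep models from shallow ones, which is the gap Fashion-MNIST was designed to fill.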

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh5.googleusercontent.com%2F_r7f8YZjr7Dmn5UkPTApEQ5ErBUfjdctkvrU8NnoKUZMSNShK5slLbhiuGXTkaOg6j881kOxpLHO2GPFQDkkoWoM5mdRzD4JaG6HYszUs8zuHW8kA0VKN4dwcRyHGlPklAGYmv9oOSuuRsjeUpnnYv9aSRPwa07Y2TVwWp5XhTUBnPsYSSIjj9bcgq2k" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh5.googleusercontent.com%2F_r7f8YZjr7Dmn5UkPTApEQ5ErBUfjdctkvrU8NnoKUZMSNShK5slLbhiuGXTkaOg6j881kOxpLHO2GPFQDkkoWoM5mdRzD4JaG6HYszUs8zuHW8kA0VKN4dwcRyHGlPklAGYmv9oOSuuRsjeUpnnYv9aSRPwa07Y2TVwWp5XhTUBnPsYSSIjj9bcgq2k" width="800" height="800"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh5.googleusercontent.com%2FDtjaYApP26Red20WgncOFbfCzjmJVP_nBVhtIFP7AJhzvZNrbvri2VUyYZj01bkhi1LArlROqovoYhfHOc9dMI64I5R89TMaSSNF9lFdwnb3fiaKXExnsv6wUtE94jqJrvyDWhC_xUWEZFwlTqJ2qbcVa-yJZes6jEeKPBYYimaMsW6_ueRAvcQCMrkU" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh5.googleusercontent.com%2FDtjaYApP26Red20WgncOFbfCzjmJVP_nBVhtIFP7AJhzvZNrbvri2VUyYZj01bkhi1LArlROqovoYhfHOc9dMI64I5R89TMaSSNF9lFdwnb3fiaKXExnsv6wUtE94jqJrvyDWhC_xUWEZFwlTqJ2qbcVa-yJZes6jEeKPBYYimaMsW6_ueRAvcQCMrkU" width="1600" height="1151"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://scholar.google.com/citations?view_op=view_citation&amp;amp;hl=en&amp;amp;user=jp7swwIAAAAJ&amp;amp;sortby=pubdate&amp;amp;citation_for_view=jp7swwIAAAAJ:IWHjjKOFINEC" rel="noopener noreferrer"&gt;Fashion-mnist: a novel image dataset for benchmarking machine learning algorithms&lt;/a&gt;, 2017&lt;/p&gt;

&lt;p&gt;Fashion-MNIST provided a more challenging dataset, allowing researchers to explore, test, and benchmark their algorithms. Today, over 5,000 academic papers have cited Fashion-MNIST in their research on classification, regression, denoising, generation, etc.&lt;/p&gt;

&lt;p&gt;Just as the topic model was only good for NLP, Fashion-MNIST was only good for computer vision. There was almost no information in the dataset that you could leverage for studying other modalities. If you look at common tasks in the CV community between 2010-2020, practically all were single-modality. Like NLP, they all covered one task, and one task only:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--2-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--2-.svg" width="803" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt; Top 20 most common CV tasks &lt;/center&gt;

&lt;h3&gt;
  
  
  Speech &amp;amp; Audio
&lt;/h3&gt;

&lt;p&gt;Speech and audio machine learning followed the same pattern: Algorithms were designed for ad-hoc tasks around the audio modality. They each performed (all together now!) one task, and one task only:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--3-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--3-.svg" width="803" height="316"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Top 20 most common acoustic tasks&lt;/center&gt;

&lt;p&gt;One of my earliest attempts at multimodal AI was a paper I published in 2010, in which I built a Bayesian model that jointly modeled the visual, textual, and acoustic modalities. Once trained, it accomplished two &lt;em&gt;cross-modal&lt;/em&gt; retrieval tasks: finding the best-matching images for a sound snippet, and vice versa. I gave these two tasks a very sci-fi name: "artificial synesthesia".&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh3.googleusercontent.com%2FYXodVhMFpju-zkNTw-5kWskHnO5kHpNLFHItAvLGp7wyyqdOlO-0E2Qcpz01nAL7qPdJLP-1hav3voNWyc86TctMHZk5KgD44t-dCAS8dGoL5Kdo3DjfM6tR2iUHN8PhCQrBYFMSgMubjd2DWeypgKOQ23w7mqJDSLMtX_0fORNDOUZQnmT_SoKnZnf3" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh3.googleusercontent.com%2FYXodVhMFpju-zkNTw-5kWskHnO5kHpNLFHItAvLGp7wyyqdOlO-0E2Qcpz01nAL7qPdJLP-1hav3voNWyc86TctMHZk5KgD44t-dCAS8dGoL5Kdo3DjfM6tR2iUHN8PhCQrBYFMSgMubjd2DWeypgKOQ23w7mqJDSLMtX_0fORNDOUZQnmT_SoKnZnf3" width="1212" height="950"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh3.googleusercontent.com%2FD9eKjn8RwIMF44Ctg1iruWX6rREbhv62Jfc1NT8f3-M7ZNNbYIOECuslvdPiBuVDXooJVeuSZUFV1GB-R2JPqPrrQzoGTZTeKj_KCxl_f7iAeH27Xk-FvUjuIHzZTNEv0gBkkpJagOiFAg7L2ZTkBLvjFOtN5mbqF-1c2aJOhyuNqGVjQdsh3gbLP2XY" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh3.googleusercontent.com%2FD9eKjn8RwIMF44Ctg1iruWX6rREbhv62Jfc1NT8f3-M7ZNNbYIOECuslvdPiBuVDXooJVeuSZUFV1GB-R2JPqPrrQzoGTZTeKj_KCxl_f7iAeH27Xk-FvUjuIHzZTNEv0gBkkpJagOiFAg7L2ZTkBLvjFOtN5mbqF-1c2aJOhyuNqGVjQdsh3gbLP2XY" width="1600" height="573"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="http://lear.inrialpes.fr/~verbeek/nips10workshop/2010-Whistler-NIPS-Xiao-Paper.pdf" rel="noopener noreferrer"&gt;Toward Artificial Synesthesia: Linking Images and Sounds via Words&lt;/a&gt;, 2010&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh6.googleusercontent.com%2FvQwxXg_ewRUZ45YDCKf5dR8_1PlT8QishzlgRc9YiXmbeTHNnLXsWGXj9d9JTXC-WE7nSWkKjiHoN0RmG8FLJtMe2CX4Gzbn792SZFcshvDDTOHCNIwPaHfQBAfrANd6MqiBwRQ3gFHwfzQsekR8so6wqlYEqcNX5SDRcNEPwa8QNNQq4iMp2YbmBJkK" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Flh6.googleusercontent.com%2FvQwxXg_ewRUZ45YDCKf5dR8_1PlT8QishzlgRc9YiXmbeTHNnLXsWGXj9d9JTXC-WE7nSWkKjiHoN0RmG8FLJtMe2CX4Gzbn792SZFcshvDDTOHCNIwPaHfQBAfrANd6MqiBwRQ3gFHwfzQsekR8so6wqlYEqcNX5SDRcNEPwa8QNNQq4iMp2YbmBJkK" width="1600" height="1159"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  Towards Multimodal AI
&lt;/h2&gt;

&lt;p&gt;From the examples above, we can see that all single-modal AI algorithms have two things in common:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Tasks are specific to just one modality (e.g. textual, visual, acoustic, etc).&lt;/li&gt;
&lt;li&gt;  Knowledge is learned from and applied to only one modality (i.e. a visual algorithm can only learn from and be applied to images).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;So far I have talked about text, images, and audio, but there are other modalities to consider as well, such as 3D, video, and time series. If we visualize all tasks from the different modalities, we get a cube, where the modalities are arranged orthogonally:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--4-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--4-.svg" width="438" height="203"&gt;&lt;/a&gt;Single-modal AI, you can assume the other facets represent the tasks from other modalities &lt;/p&gt;

&lt;p&gt;On the other hand, multimodal AI is like molding this cube into a sphere, erasing boundaries between different modalities, where:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  Tasks are shared and transferred between multiple modalities (so one algorithm can work with images &lt;em&gt;and&lt;/em&gt; text &lt;em&gt;and&lt;/em&gt; audio).&lt;/li&gt;
&lt;li&gt;  Knowledge is learned from and applied to multiple modalities (so an algorithm can learn from &lt;em&gt;textual&lt;/em&gt; data and apply that to &lt;em&gt;visual&lt;/em&gt; data).&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FTextual.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FTextual.svg" width="1920" height="1080"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Multimodal AI&lt;/center&gt;

&lt;p&gt;&lt;strong&gt;The rise of multimodal AI can be attributed to advances in two machine learning techniques: Representation learning and transfer learning.&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;  &lt;em&gt;Representation learning&lt;/em&gt; lets models create common representations for all modalities.&lt;/li&gt;
&lt;li&gt;  &lt;em&gt;Transfer learning&lt;/em&gt; lets models first learn fundamental knowledge, and then fine-tune on specific domains.&lt;/li&gt;
&lt;/ul&gt;
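&lt;p&gt;As a toy sketch of the first idea, imagine two encoders, one per modality, mapping into one shared vector space so that a single similarity function compares any text-image pair. The encoders below are random placeholders, purely for illustration:&lt;/p&gt;

```python
# Common-representation sketch: both modalities land in the same
# 64-dimensional unit sphere, so one dot product compares them.
import numpy as np

rng = np.random.default_rng(0)
DIM = 64  # shared embedding dimension

def encode_text(tokens):
    # stand-in for a real text encoder
    vec = rng.standard_normal(DIM)
    return vec / np.linalg.norm(vec)

def encode_image(pixels):
    # stand-in for a real image encoder
    vec = rng.standard_normal(DIM)
    return vec / np.linalg.norm(vec)

t = encode_text(["a", "photo", "of", "a", "cat"])
i = encode_image(np.zeros((28, 28)))
similarity = float(t @ i)  # one metric works across modalities
print(similarity)
```

In a real model such as CLIP, the two encoders are trained jointly so that matching text-image pairs end up close in this shared space.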

&lt;p&gt;Without these techniques, multimodal AI on generic data types would be unfeasible or merely a toy, just like my sound-image paper from back in 2010.&lt;/p&gt;

&lt;p&gt;In 2021 we saw CLIP, a model that captures the alignment between image and text; in 2022, we saw DALL·E 2 and Stable Diffusion generate high-quality images from text prompts.&lt;/p&gt;

&lt;p&gt;The paradigm shift has already started: &lt;strong&gt;In the future we'll see more and more AI applications move beyond one data modality and leverage relationships between different modalities.&lt;/strong&gt; Ad-hoc approaches are dying out as boundaries between data modalities become fuzzy and meaningless:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--8-.svg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fjina-ai-gmbh.ghost.io%2Fcontent%2Fimages%2F2022%2F10%2FUntitled-drawing--8-.svg" width="828" height="348"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;The paradigm shift from single-modal AI to multimodal AI&lt;/center&gt;

&lt;h2&gt;
  
  
  The Duality of Search &amp;amp; Creation
&lt;/h2&gt;

&lt;p&gt;Search and creation are two essential tasks in multimodal AI. Here, search means &lt;em&gt;neural search&lt;/em&gt;, namely searching using deep neural networks. For most people, these two tasks are entirely separate, and they have been studied independently for many years. But let me point this out: &lt;strong&gt;search and creation are strongly connected and share a duality&lt;/strong&gt;. To understand this, let's look at the following examples.&lt;/p&gt;

&lt;p&gt;With multimodal AI, it's simple to use a text or image query to search a dataset of images:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fooex5wp52bqjmbc9cdp8.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fooex5wp52bqjmbc9cdp8.png" width="800" height="240"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Search: &lt;em&gt;find&lt;/em&gt; what you need&lt;/center&gt;
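&lt;p&gt;Mechanically, such a search boils down to ranking stored embeddings by cosine similarity against the query embedding. This sketch uses random placeholder vectors; a real system would produce them with a multimodal model:&lt;/p&gt;

```python
# Minimal neural-search sketch: rank an index of image embeddings
# by cosine similarity to a query embedding and return the top k.
import numpy as np

rng = np.random.default_rng(42)
index = rng.standard_normal((1000, 64))                # 1000 "image" vectors
index /= np.linalg.norm(index, axis=1, keepdims=True)  # L2-normalize rows

def search(query_vec, k=5):
    q = query_vec / np.linalg.norm(query_vec)
    scores = index @ q                 # cosine similarity via dot product
    top = np.argsort(-scores)[:k]      # indices of the k best matches
    return top, scores[top]

ids, scores = search(rng.standard_normal(64))
print(ids, scores)
```

Because the query can come from any modality that shares the embedding space, the same ranking code serves text-to-image and image-to-image search alike.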

&lt;p&gt;Creation is similar. You create a new image from a text prompt or by enriching/inpainting an existing image:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1t9t120z8ys5p2vedgw.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fq1t9t120z8ys5p2vedgw.png" width="800" height="244"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Create: &lt;em&gt;make&lt;/em&gt; what you need&lt;/center&gt;

&lt;p&gt;When grouping these two tasks together and masking out their function names, you can see the two tasks are indistinguishable: Both receive and output the same data type(s). The only difference is that search &lt;em&gt;finds&lt;/em&gt; what you need, whereas creation &lt;em&gt;makes&lt;/em&gt; what you need:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ct20pjl5sj6a45cnuz7.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ct20pjl5sj6a45cnuz7.png" width="800" height="426"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;DNA is a good analogy: Once you have an organism's DNA, you can build a phylogenetic tree and &lt;em&gt;search&lt;/em&gt; for the oldest and most primitive known ancestor. On the other hand, you could inject the DNA into an egg and &lt;em&gt;create&lt;/em&gt; something new, just as David creates his own alien creature:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymviizi80vg3b3xkku3z.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fymviizi80vg3b3xkku3z.gif" width="480" height="141"&gt;&lt;/a&gt;The duality of search and creation under the multimodal AI framework. Movie poster from "&lt;em&gt;Alien: Covenant&lt;/em&gt;"&lt;/p&gt;

&lt;p&gt;Or think of Doraemon and Rick. Both help their sidekicks solve problems with out-of-this-world items. The difference is that Doraemon &lt;em&gt;searches&lt;/em&gt; through his pocket for an existing item, whereas Rick &lt;em&gt;creates&lt;/em&gt; something new in his garage workshop.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnplt34y5rs8rujcnlu7s.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnplt34y5rs8rujcnlu7s.png" width="800" height="326"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;Doraemon represents neural search, whereas Rick represents creative AI&lt;/center&gt;

&lt;p&gt;The duality of search and creation also poses an interesting thought experiment. Imagine living in a world where all images are created by AI rather than humans. Do we still need (neural) search? Namely, do we need to embed images into vectors and then use a vector database to index and sort them? The answer is NO. As the seed and prompts that uniquely represent the image are known before observing the image, &lt;strong&gt;the consequence now becomes the cause&lt;/strong&gt;. Contrast this to the classic representation, where learning an image is the cause and representation is the consequence. To search images, we can simply store the seed (an integer) and the prompt (a string), which is nothing more than a good old BM25 or binary search. Of course, we humans appreciate photography and human-made artworks, so that parallel Earth is not our reality (yet). Nonetheless, this thought experiment gives a good reason why neural searchers should care about advances in creative AI, as the old way of handling multimodal data may become obsolete.&lt;/p&gt;
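&lt;p&gt;A minimal sketch of that thought experiment, assuming every image is fully determined by an integer seed plus a prompt string: the index reduces to a sorted list of (prompt, seed) pairs, and retrieval becomes a plain binary search instead of a vector lookup.&lt;/p&gt;

```python
import bisect

# If the (prompt, seed) pair *is* the image's representation, the
# consequence (the image) is fully determined by the cause, and
# indexing needs no vectors: a sorted list plus binary search suffices.
index = sorted([
    ("a cat in a hat", 42),
    ("a dog on a log", 7),
    ("a fox in a box", 1337),
])

def find_seed(prompt):
    """Binary-search the seed that regenerates the image for a prompt."""
    i = bisect.bisect_left(index, (prompt,))
    if len(index) > i and index[i][0] == prompt:
        return index[i][1]
    return None

print(find_seed("a cat in a hat"))
```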

&lt;h2&gt;
  
  
  Summary
&lt;/h2&gt;

&lt;p&gt;We are on the frontier of a new era in AI, where multimodal learning will soon dominate. This type of learning, combining multiple data types and modalities, has the potential to revolutionize the way we interact with machines. So far, multimodal AI has had great success in fields like computer vision and natural language processing. In the future, expect multimodal AI to have an even greater impact: for instance, systems that understand the nuances of human communication, or more lifelike virtual assistants. The possibilities are endless, and we're just beginning to scratch the surface. So buckle up for the future, because the best is yet to come!&lt;/p&gt;

&lt;p&gt;Want to work on multimodal AI, neural search, and creative AI? Join us and help lead the revolution!&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;p&gt;Follow us on &lt;a href="//github.com/jina-ai"&gt;GitHub&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Join our &lt;a href="//slack.jina.ai"&gt;community&lt;/a&gt;&lt;/p&gt;&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fenwjhx4nu3e9gjmqa0zy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fenwjhx4nu3e9gjmqa0zy.png" width="800" height="420"&gt;&lt;/a&gt;&lt;/p&gt;

</description>
    </item>
    <item>
      <title>Deployment through EKS add-ons: Simplifying Istio on EKS</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Wed, 30 Nov 2022 08:28:12 +0000</pubDate>
      <link>https://dev.to/aws/deployment-through-eks-add-ons-simplifying-istio-on-eks-mo</link>
      <guid>https://dev.to/aws/deployment-through-eks-add-ons-simplifying-istio-on-eks-mo</guid>
      <description>&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuxjo65n75ye8jkmhxb5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhuxjo65n75ye8jkmhxb5.jpg" alt=" " width="800" height="533"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Today, we are excited to announce that Tetrate Istio Distro (TID) is deployable as an add-on for Amazon EKS. Istio is the de facto standard for running service mesh on top of Kubernetes, and AWS EKS is one of the most popular ways to run Kubernetes. Tetrate’s TID is the first Istio distro deployable using EKS add-on commands, and this new capability makes it super simple to run Tetrate’s hardened, FIPS-compliant, and fully upstream Istio distribution everywhere EKS is available. By making TID available as an add-on, we are making it easier for organizations to productionize Istio, reduce operational complexity, and ultimately achieve application security and modernization goals faster. In this blog, I will show you how to get started with this new EKS add-on.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;Why Run TID as an EKS Add-on?&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;When you want to deploy Istio in production, the first question is where to get your Istio distribution. Tetrate Istio Distro is Tetrate’s hardened, performant, and fully upstream Istio distribution. Teams often choose to run TID because it’s built by Tetrate’s Istio experts (in addition to being co-creators of Istio, we also &lt;a href="https://tetr8.io/3UkuTyN" rel="noopener noreferrer"&gt;built the official CNCF course on Istio&lt;/a&gt;). TID support and FIPS certification are available on the AWS Marketplace as a paid subscription service (&lt;a href="https://tetr8.io/tis-aws" rel="noopener noreferrer"&gt;Tetrate Istio Subscription&lt;/a&gt;). It’s a great way to get started with Istio knowing you have a trusted distribution to begin with, have an expert team supporting you, and also have the option to get to FIPS compliance quickly if you need to.&lt;/p&gt;

&lt;p&gt;As for how to run Istio, that’s where the EKS add-on comes in. If you are not familiar with EKS add-ons, they provide supporting operational capabilities for Kubernetes applications built right into the EKS workflow. Add-ons provide installation and management of important capabilities for Amazon EKS clusters; they always include the latest security patches and bug fixes, and are validated by AWS to work with Amazon EKS.&lt;/p&gt;

&lt;p&gt;Even though Istio is widely deployed, it has never been deployable through EKS add-ons until now. Without add-on support, deploying and managing Istio took extra work, such as obtaining the distro from AWS Marketplace and deploying it via additional scripts, which also meant more operational overhead down the road.&lt;/p&gt;

&lt;p&gt;Now that TID is deployable through EKS add-ons, you get all the benefits of TID and EKS add-ons in one package. AWS has fully tested and integrated TID into the EKS workflow, which means deploying TID is now part of standard EKS configuration, and enabling it can be as simple as one command-line argument. This vastly reduces the overhead of deploying and managing Istio across your EKS footprint, and best of all, it’s validated by EKS experts from AWS engineering and Istio experts from Tetrate.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;A Short Tutorial to Get Started&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;To deploy Tetrate Istio Distro on a new EKS cluster, customers have two options: the AWS web console and the command line. We’ll cover both approaches here, starting with the AWS web console.&lt;/p&gt;

&lt;h2&gt;
  
  
  Deploy the TID Add-on for Amazon EKS with the AWS Web Console
&lt;/h2&gt;

&lt;p&gt;The AWS web console provides an intuitive way to deploy the add-on in an EKS cluster.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Log into the AWS console and navigate to your EKS cluster. Then, pick the Tetrate Istio Distribution add-on as shown in the screenshot below (Figure 1):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn9ua40khembxfyd5qi5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fjn9ua40khembxfyd5qi5.jpg" alt=" " width="800" height="668"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 1: Select the Tetrate Istio Distro add-on for EKS from AWS Marketplace.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; After the add-on is selected, you will be presented with a configuration screen. The Istio deployment is simple and doesn’t require extensive settings. The final pre-deployment screen will look like this (Figure 2):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpdcjayqnbb9sbustmpx.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fmpdcjayqnbb9sbustmpx.jpg" alt=" " width="800" height="470"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 2: Review configuration and add.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; The deployment will start as soon as you confirm the configuration (Figure 2) and click the “Create” button. You can monitor the deployment status in the AWS console as shown below (Figure 3):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv013h9y4fwd9qtj0cr2.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fcv013h9y4fwd9qtj0cr2.jpg" alt=" " width="800" height="104"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 3: Note the deployment status.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Wait until the deployment is complete and the status changes to “Active.” This may take about two minutes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbw3aczcmen957cwiwd65.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fbw3aczcmen957cwiwd65.jpg" alt=" " width="125" height="74"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Congratulations—you have deployed an enterprise-grade Istio distro to your Amazon EKS cluster!&lt;/p&gt;
&lt;h2&gt;
  
  
  Deploy the TID Add-on for Amazon EKS from the Command Line
&lt;/h2&gt;

&lt;p&gt;For repeatable cluster deployments, it makes sense to automate via command-line instructions. Follow the simple steps below to deploy the TID add-on for EKS via the aws CLI.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Note:&lt;/strong&gt; You will need to &lt;a href="https://aws.amazon.com/marketplace/pp/prodview-rm6w3vwyibt46?qid=1669226767718&amp;amp;sr=0-1&amp;amp;ref_=srh_res_product_title" rel="noopener noreferrer"&gt;subscribe to TID in the AWS Marketplace&lt;/a&gt; first before the TID add-on can be&lt;br&gt;
deployed in your AWS account.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 1:&lt;/strong&gt; Check to make sure that the add-on is available with the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks describe-addon-versions --addon-name tetrate-io_istio-distro
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the TID add-on is available, you should see output similar to the following (Figure 5):&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0t58bkkivdwwxhplc3jt.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0t58bkkivdwwxhplc3jt.jpg" alt=" " width="571" height="639"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 5: JSON description of the TID add-on for Amazon EKS via the aws eks describe-addon-versions command.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 2:&lt;/strong&gt; Deploy the TID add-on to your EKS cluster using the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks create-addon --addon-name tetrate-io_istio-distro --cluster-name &amp;lt;CLUSTER_NAME&amp;gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5avkve7w5s6cvt0653q.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fg5avkve7w5s6cvt0653q.jpg" alt=" " width="800" height="245"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 6: Sample output from the aws eks create-addon command showing the add-on being created.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Step 3:&lt;/strong&gt; Wait for the TID add-on to be deployed. This may take about two minutes. To get the current state use the following command:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;aws eks describe-addon --addon-name tetrate-io_istio-distro --cluster-name
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvemjgdlmp8a1uxzkavuf.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvemjgdlmp8a1uxzkavuf.jpg" alt=" " width="800" height="227"&gt;&lt;/a&gt;&lt;br&gt;
&lt;em&gt;Figure 7: Sample output of the aws eks describe-addon to monitor deployment status.&lt;/em&gt;&lt;/p&gt;
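&lt;p&gt;In an automation script you would typically wrap Step 3 in a polling loop rather than re-running the command by hand. Below is a minimal Python sketch of that wait logic; the get_status callable is injected so the loop is testable on its own, but in practice it would shell out to aws eks describe-addon (or call boto3's describe_addon) and read the reported status, which moves from CREATING to ACTIVE on success.&lt;/p&gt;

```python
import time

def wait_for_addon(get_status, timeout=300, interval=5, sleep=time.sleep):
    """Poll get_status() until it returns 'ACTIVE' or the timeout expires.

    get_status: zero-argument callable returning the add-on status
    string, e.g. a wrapper around the aws eks describe-addon command.
    """
    waited = 0
    while timeout >= waited:
        status = get_status()
        if status == "ACTIVE":
            return status
        if status in ("CREATE_FAILED", "DEGRADED"):
            raise RuntimeError(f"add-on entered terminal state {status}")
        sleep(interval)
        waited += interval
    raise TimeoutError(f"add-on not ACTIVE after {timeout}s")

# Example with a stubbed status sequence standing in for the AWS API:
statuses = iter(["CREATING", "CREATING", "ACTIVE"])
print(wait_for_addon(lambda: next(statuses), sleep=lambda _: None))
# prints ACTIVE
```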

&lt;p&gt;&lt;strong&gt;Step 4:&lt;/strong&gt; Confirm that Istio has been deployed to your cluster by running the following command in the Kubernetes context to see that Istio pods are in “Running” state:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;kubectl get pods -n istio-system
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cbujzn5bbok128qlz4d.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5cbujzn5bbok128qlz4d.jpg" alt=" " width="337" height="91"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Figure 8: Confirm Istio deployment via the kubectl command.&lt;/em&gt;&lt;br&gt;
Congratulations—you have deployed an enterprise-grade Istio distro to your EKS cluster with a simple, repeatable set of command-line steps.&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;What’s next&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;You can try deploying TID through EKS add-ons &lt;a href="https://aws.amazon.com/marketplace/pp/prodview-rm6w3vwyibt46?qid=1669226767718&amp;amp;sr=0-1&amp;amp;ref_=srh_res_product_title" rel="noopener noreferrer"&gt;here&lt;/a&gt;. If you want a deeper dive into how this all works, &lt;a href="http://tetr8.io/3DsaVv4" rel="noopener noreferrer"&gt;sign up for this workshop&lt;/a&gt;. If you want to try Tetrate Istio Distro on its own, you can find it &lt;a href="https://tetr8.io/tid" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Contact us if you’d like the &lt;a href="https://tetr8.io/tid-fips" rel="noopener noreferrer"&gt;FIPS-compliant version&lt;/a&gt;. Once you have Istio up and running, you will probably need simpler ways to manage and secure your services beyond what’s available in Istio; that’s where Tetrate Service Bridge comes in. You can learn more about how Tetrate Service Bridge makes service mesh more secure, manageable, and resilient &lt;a href="https://tetrate.io/tetrate-service-bridge/" rel="noopener noreferrer"&gt;here&lt;/a&gt;, or &lt;a href="https://tetr8.io/contact" rel="noopener noreferrer"&gt;contact us&lt;/a&gt; for a quick demo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Authors&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv7urgyxks45pk0gzpgjz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fv7urgyxks45pk0gzpgjz.png" alt=" " width="500" height="500"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://tetrate.io/blog/author/petr-zh-hans/" rel="noopener noreferrer"&gt;Petr McAllister&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F403viuu22gweuocfwbv0.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F403viuu22gweuocfwbv0.png" alt=" " width="450" height="450"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://tetrate.io/blog/author/saptak/" rel="noopener noreferrer"&gt;Saptak Sen&lt;/a&gt;&lt;/p&gt;

</description>
      <category>codequality</category>
      <category>discuss</category>
    </item>
    <item>
      <title>Quantum programming frameworks: Amazon Braket SDK and QuTrunk</title>
      <dc:creator>YUE GUO</dc:creator>
      <pubDate>Wed, 30 Nov 2022 08:02:45 +0000</pubDate>
      <link>https://dev.to/aws/quantum-programming-framework-amazon-braket-sdk-and-qutrunk-f86</link>
      <guid>https://dev.to/aws/quantum-programming-framework-amazon-braket-sdk-and-qutrunk-f86</guid>
      <description>&lt;p&gt;&lt;strong&gt;KY&lt;sup&gt;1&lt;/sup&gt;, Bertran Shao&lt;sup&gt;2&lt;/sup&gt;, Don Tang&lt;sup&gt;3&lt;/sup&gt;&lt;/strong&gt;&lt;br&gt;
&lt;em&gt;1. AWS Hero; 2. Head of Developer Relations, QUDOOR; 3. Developer Relations, QUDOOR.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;In classical software development, a framework is an abstraction that packages general-purpose functionality. Developers build customized services for specific applications by extending the framework code according to their functional requirements. In the era of big data, classical computing struggles with massive datasets, whereas quantum computing stands out for computational acceleration thanks to entanglement and superposition. Industry research and design of quantum programming software generally inherits classical programming ideas in order to expose the advantages of quantum computing.&lt;/p&gt;

&lt;p&gt;At present, many technology companies have launched quantum programming framework software, such as ProjectQ (an open-source software framework for quantum computing), Qiskit (an open-source quantum programming development kit from IBM), Cirq (an open-source quantum algorithm framework from Google), the Amazon Braket SDK, and QuTrunk (a quantum programming framework from QUDOOR). The rest of this article introduces the Amazon Braket SDK and QUDOOR's QuTrunk quantum programming framework.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;1. What is a quantum programming framework&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;A quantum programming framework focuses on the rapid development of quantum subprograms under current technical conditions. Quantum programming frameworks usually take a traditional programming language as the host language and add variables, functions, objects, and other elements that describe the quantum computing system. Developers can then implement quantum algorithms and develop quantum programs within the framework. Quantum programming frameworks often include libraries of commonly used quantum algorithms, which makes developing quantum programs more efficient. Hybrid quantum-classical programming can also be realized easily in the framework's host language.&lt;/p&gt;

&lt;p&gt;Common quantum programming frameworks include QPanda, QDK, Cirq, Qiskit, ProjectQ, Forest, the Amazon Braket SDK, and QuTrunk. Because a quantum programming framework introduces quantum computing concepts into a classical host language, and because the QPU in a quantum computer is simply a device analogous to the CPU, this is a familiar development paradigm for developers. After a program developed with a quantum programming framework is compiled, the classical code is converted into machine instructions and executed on the classical processor, while the quantum circuit code describing the quantum algorithm is converted into a quantum intermediate representation and sent to the QPU for processing.&lt;/p&gt;
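&lt;p&gt;As a framework-free sketch of the host-language idea (plain Python; class and method names are illustrative, not any particular SDK), the quantum part of a program is just classical objects, a list of gate records, that a framework would later lower to an intermediate representation for the QPU:&lt;/p&gt;

```python
# Plain-Python sketch: classical host-language objects describe the
# quantum circuit; a real framework would compile self.gates into a
# quantum intermediate representation for the QPU.

class Circuit:
    def __init__(self, num_qubits):
        self.num_qubits = num_qubits
        self.gates = []          # classical data describing quantum ops

    def h(self, qubit):
        self.gates.append(("H", qubit))
        return self

    def cnot(self, control, target):
        self.gates.append(("CNOT", control, target))
        return self

    def to_ir(self):
        """Serialize to a toy textual intermediate representation."""
        return "; ".join(" ".join(str(x) for x in gate) for gate in self.gates)

bell = Circuit(2).h(0).cnot(0, 1)
print(bell.to_ir())  # H 0; CNOT 0 1
```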

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvi5n2vuefj5f71a7kyx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Frvi5n2vuefj5f71a7kyx.png" alt=" " width="702" height="398"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Quantum programming framework software is used to write quantum algorithms and programs running in quantum computers. After encapsulation, it can also provide commonly used quantum computing components and quantum algorithm libraries for rapid development of quantum programs.&lt;/p&gt;
&lt;h2&gt;
  
  
  &lt;strong&gt;2. Amazon Braket SDK&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;2.1 Amazon Braket&lt;/strong&gt;&lt;br&gt;
Amazon Braket is a fully managed Amazon Web Services (AWS) service that helps researchers and developers explore potential applications and evaluate current quantum computing technologies. Users can design their own quantum algorithms or pick quantum algorithms from the built-in libraries. Once an algorithm is defined, Amazon Braket provides a fully managed simulator to help with troubleshooting. The workflow has three stages:&lt;br&gt;
(1) Build: install Jupyter and the Amazon Braket SDK, then write your quantum program.&lt;br&gt;
(2) Test: run quantum circuits in a simulator. Braket supports four kinds of simulators, including a local simulator that reproduces the quantum environment on your own machine.&lt;br&gt;
(3) Run: execute quantum algorithms in a real quantum environment. At present, AWS quantum devices include D-Wave, IonQ, and OQC. Because the influence of quantum computer noise cannot yet be completely removed, AWS provides a hybrid environment in which the QPU works together with the CPU, with hybrid algorithms supported through PennyLane.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F234e3o17com5pk5ie720.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F234e3o17com5pk5ie720.png" alt=" " width="749" height="436"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;How Amazon Braket Works &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb3jydoclcgn57bvvh954.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb3jydoclcgn57bvvh954.png" alt=" " width="800" height="217"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foabizzzo4s06aou9tg8d.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Foabizzzo4s06aou9tg8d.png" alt=" " width="800" height="444"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(1) Learn&lt;br&gt;
Amazon Braket provides step-by-step instructions, tutorials, and resource libraries to help you quickly start experimenting with quantum computing.&lt;/p&gt;

&lt;p&gt;(2) Design&lt;br&gt;
To design quantum algorithms, you can use a fully managed Jupyter notebook directly in Amazon Braket. The notebooks give you access to pre-installed developer tools, sample algorithms, and documentation so you can get started quickly.&lt;/p&gt;

&lt;p&gt;(3) Test&lt;br&gt;
You can use simulators running on classical hardware to simplify code diagnosis and design optimization, speeding up algorithm development. Amazon Braket runs the simulators as a fully managed service: it automatically provisions the required compute instances, publishes the results to Amazon S3, and shuts down the resources when the job is finished.&lt;/p&gt;

&lt;p&gt;(4) Run&lt;br&gt;
You can run quantum algorithms on the quantum hardware of your choice and pay according to actual usage. If you choose to run a hybrid quantum algorithm, Amazon Braket can automatically provision the required classical computing resources and manage the workflow between classical and quantum tasks.&lt;/p&gt;

&lt;p&gt;(5) Analyze&lt;br&gt;
When a task completes, the system automatically notifies you and stores the results in Amazon S3. Amazon Braket also publishes event logs and performance metrics, such as completion status and running time, to Amazon CloudWatch.&lt;/p&gt;

&lt;p&gt;Amazon Braket functions&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ich5rvumvj4i925rzwk.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F1ich5rvumvj4i925rzwk.png" alt=" " width="800" height="1200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Braket Task Flow &lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tvc3yovio79mf9zv59e.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F5tvc3yovio79mf9zv59e.png" alt=" " width="800" height="441"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(1) To make it easy for customers to define, submit, and monitor their tasks, Amazon Braket provides an environment with Jupyter notebooks.&lt;/p&gt;

&lt;p&gt;(2) Amazon Braket SDK. You can build quantum circuits directly in the SDK, or define annealing problems and parameters for annealing devices. The Amazon Braket SDK also provides a plug-in for D-Wave's Ocean tool suite so that you can program D-Wave devices natively. After defining a task, you select the device to execute on and submit the task to the Amazon Braket API.&lt;/p&gt;

&lt;p&gt;(3) Depending on the device you choose, the task is sent to a QPU or a simulator for execution, queuing until the device becomes available.&lt;/p&gt;

&lt;p&gt;(4) Amazon Braket gives you access to five different types of QPUs (D-Wave, IonQ, OQC, Xanadu, Rigetti) and three managed simulators (SV1, DM1, TN1). After your task is processed, Amazon Braket returns the results to Amazon S3, and the data is stored in your AWS account.&lt;/p&gt;

&lt;p&gt;(5) Meanwhile, the SDK polls for results in the background and loads them into the Jupyter notebook when the task completes. You can also view and manage tasks on the task page of the Amazon Braket console, or use the GetQuantumTask operation of the Amazon Braket API. Amazon Braket integrates with AWS Identity and Access Management, Amazon CloudWatch, AWS CloudTrail, and Amazon EventBridge for access management, monitoring, logging, and event-based processing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.2 Braket Python SDK&lt;/strong&gt;&lt;br&gt;
The Amazon Braket Python SDK is a quantum software development kit that provides a framework for interacting with quantum computing hardware devices through Amazon Braket. Amazon Braket can run quantum algorithms on quantum processors based on different technologies, including systems from D-Wave, IonQ, and Rigetti. Simulation and quantum hardware operations are managed through a unified development experience, and customers pay only for the computing resources they use. The Amazon Braket SDK includes a free local simulator, which is suitable for small and medium-sized simulations (usually up to 25 qubits). Customers can run a quantum computer simulator on Amazon EC2 computing resources to test their algorithms and troubleshoot. When ready, customers can run their algorithms on the quantum computers of their choice without having to engage multiple providers or commit to a single technology.&lt;/p&gt;

&lt;p&gt;For larger and more complex algorithms that require high-performance computing resources (up to 34 qubits), you can submit simulation tasks to the managed Braket simulators. The cost depends on the duration of each simulation task: billing is at an hourly rate, in one-second increments, settled according to the time the simulation takes to execute.&lt;/p&gt;
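
&lt;p&gt;As a worked example of "hourly rate, billed in one-second increments" (the rate below is deliberately hypothetical, not an actual Braket price):&lt;/p&gt;

```python
import math

# Hypothetical billing sketch: an hourly rate, billed in one-second
# increments. The $4.50/hour figure is illustrative only.
def simulator_cost(duration_seconds: float, hourly_rate: float) -> float:
    billable_seconds = math.ceil(duration_seconds)  # one-second increments
    return billable_seconds * hourly_rate / 3600

# A 6-minute simulation at a hypothetical $4.50/hour:
print(round(simulator_cost(360, 4.50), 2))  # 0.45
```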

&lt;p&gt;&lt;strong&gt;2.3 Local simulated quantum environment&lt;/strong&gt;&lt;br&gt;
You can program quantum circuits locally by installing Jupyter.&lt;br&gt;
Install the Amazon Braket SDK:&lt;br&gt;
&lt;code&gt;pip install amazon-braket-sdk&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;(1) Use Jupyter to run the Bell algorithm locally.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fetiia1kt7dzk2pcx422w.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fetiia1kt7dzk2pcx422w.png" alt=" " width="687" height="391"&gt;&lt;/a&gt;&lt;/p&gt;
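
&lt;p&gt;As a dependency-free sketch of what the Bell circuit in the screenshot computes (pure-Python statevector arithmetic, not the Braket SDK itself):&lt;/p&gt;

```python
# Minimal 2-qubit statevector sketch of the Bell circuit.
# Basis order: |00>, |01>, |10>, |11> (qubit 0 is the leftmost bit).
import math

def h_on_q0(state):
    # Hadamard on qubit 0
    s = 1 / math.sqrt(2)
    a, b, c, d = state
    return [s * (a + c), s * (b + d), s * (a - c), s * (b - d)]

def cnot_q0_q1(state):
    # CNOT with control qubit 0, target qubit 1: swaps |10> and |11>
    a, b, c, d = state
    return [a, b, d, c]

state = [1.0, 0.0, 0.0, 0.0]        # start in |00>
state = cnot_q0_q1(h_on_q0(state))  # H on q0, then CNOT
probs = [round(x * x, 3) for x in state]
print(probs)  # [0.5, 0.0, 0.0, 0.5] -> "00" and "11" equally likely
```

Measuring this state yields only "00" or "11", each with probability 1/2, which matches the counts the SDK run reports.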

&lt;p&gt;(2) The results can be displayed using a visualization library.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe290f6r40xa30m5xhwpy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe290f6r40xa30m5xhwpy.png" alt=" " width="704" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;AWS supports local simulators as well as managed simulator services that are more efficient and can handle more qubits.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd60ibyord86svbx23cz.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftd60ibyord86svbx23cz.png" alt=" " width="544" height="254"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2.4 Features of the Amazon Braket SDK&lt;/strong&gt;&lt;br&gt;
(1) Moments. Braket has a pseudo-time concept called a moment, meaning that a qubit performs a gate operation at the current time slot. The moments attribute decomposes a circuit gate by gate, making it possible to obtain the intermediate state of the circuit at a given moment and to analyze how the algorithm runs. The code is shown in the following figure:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82htnnowny7j0xtzdg14.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F82htnnowny7j0xtzdg14.png" alt=" " width="800" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0so9qqffzmq75eexgjco.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F0so9qqffzmq75eexgjco.png" alt=" " width="800" height="472"&gt;&lt;/a&gt;&lt;/p&gt;
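
&lt;p&gt;Conceptually, moment assignment can be sketched as "each gate lands in the earliest time slot where all of its qubits are free" (an illustrative model, not the actual Braket SDK implementation):&lt;/p&gt;

```python
# Illustrative moment scheduling: each gate is placed in the earliest
# moment in which every qubit it touches is free.
def assign_moments(gates):
    next_free = {}  # qubit -> first free moment index
    moments = []
    for name, qubits in gates:
        moment = max((next_free.get(q, 0) for q in qubits), default=0)
        moments.append((name, moment))
        for q in qubits:
            next_free[q] = moment + 1
    return moments

circuit = [("H", [0]), ("CNOT", [0, 1]), ("X", [2]), ("Z", [1])]
print(assign_moments(circuit))
# [('H', 0), ('CNOT', 1), ('X', 0), ('Z', 2)]
```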

&lt;p&gt;(2) Result types returned by the Braket SDK: Braket supports multiple result types, and different backends return different ones, as shown in the following table:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifb2nym2kfcv3655h1hv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fifb2nym2kfcv3655h1hv.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(3) Verbatim compilation: the compiler performs no optimization, and every step of the algorithm is translated exactly as written, ensuring that the executed circuit matches its design. (By default, the Braket compiler optimizes the circuit during compilation, the stage where circuit gates are translated into QPU-native gates.) A typical application of this feature is benchmarking hardware performance.&lt;br&gt;
(4) Support for OpenQASM 3.0.&lt;br&gt;
(5) Amazon Braket provides qubit-connectivity query interfaces for the different quantum computer platforms.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvj8gl2tq30sp09hkdzsq.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fvj8gl2tq30sp09hkdzsq.png" alt=" " width="800" height="221"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(6) Hybrid jobs. Braket supports PennyLane's embedded simulators, including a GPU-accelerated embedded simulator.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcdnmhbese3hh7uy0rh6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fqcdnmhbese3hh7uy0rh6.png" alt=" " width="800" height="390"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;(7) Braket provides Hybrid Jobs for hybrid quantum-classical algorithms. When creating a job, you can upload the quantum algorithm script and related parameters. Job results can be persisted in Amazon S3, or viewed in the console and in CloudWatch.&lt;/p&gt;
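
&lt;p&gt;A hybrid job couples a classical optimizer with quantum expectation values. A minimal sketch, with &lt;code&gt;quantum_expectation&lt;/code&gt; as a hypothetical stand-in for a device or simulator call (for Rx(θ)|0〉, 〈Z〉 = cos θ):&lt;/p&gt;

```python
import math

# Sketch of a hybrid quantum-classical loop. quantum_expectation stands
# in for a device/simulator call; for Rx(theta)|0>, <Z> = cos(theta).
def quantum_expectation(theta: float) -> float:
    return math.cos(theta)

theta = 0.5
for _ in range(100):
    # parameter-shift gradient: (E(t + pi/2) - E(t - pi/2)) / 2
    grad = (quantum_expectation(theta + math.pi / 2)
            - quantum_expectation(theta - math.pi / 2)) / 2
    theta -= 0.2 * grad  # classical gradient-descent update

print(round(quantum_expectation(theta), 3))  # -1.0, minimum at theta = pi
```

In a real hybrid job, only the expectation-value evaluation runs on the quantum device; the optimizer loop stays classical.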
&lt;h2&gt;
  
  
  &lt;strong&gt;3. Introduction to QuTrunk&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;3.1 Quantum Programming Framework&lt;/strong&gt;&lt;br&gt;
QuTrunk is a free, open-source, cross-platform quantum computing programming framework independently developed by QUDOOR. It includes a quantum programming API, quantum instruction translation, and quantum computing backend interfaces.&lt;/p&gt;

&lt;p&gt;QuTrunk uses Python as the host language and leverages Python's syntax features to implement a DSL (domain-specific language) for quantum programs. Any IDE that supports Python can be used. QuTrunk provides the APIs required for quantum programming based on quantum logic gates, quantum circuits, and so on. These APIs are implemented by corresponding modules: QCircuit implements quantum circuits, Qubit implements qubits, Qureg implements quantum registers, Command implements the instruction for each quantum gate operation, Backend implements the backend modules for running quantum circuits, and the gate modules implement the basic quantum gate operations. In addition, QuTrunk can serve as a foundation for other quantum computing applications, such as quantum algorithms, quantum visual programming, and quantum machine learning.&lt;/p&gt;

&lt;p&gt;QuSprout is currently the default remote backend of QuTrunk. QuSprout is quantum computing simulation software, also developed by QUDOOR, that runs on classical computing resources. It supports multi-threading, multi-node operation, and GPU acceleration, and can also be pre-installed in QuBox. QuTrunk provides the quantum programming framework, and within QuTrunk we have established a unified set of quantum programming specifications.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3.2 Structure Resolution&lt;/strong&gt;&lt;br&gt;
The architecture of QuTrunk is shown in the figure below. Developers can use QuBranch to write and compile Python code directly. The modular design allows developers to adjust the compiled gate set to the situation at hand and complete customized designs. At present, QuTrunk uses BackendLocal as the default backend and provides a Python implementation of this local backend.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjhlqsqeeyo9j14mmduv.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Ftjhlqsqeeyo9j14mmduv.png" alt=" " width="800" height="163"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;QuTrunk also supports adding more backends. In principle, QuTrunk can be made compatible with any backend that exposes a quantum computing access interface, for example those of IBM or IonQ. Developers can write quantum programs with QuTrunk, use QuSL to translate them into instructions for the target platform, and then choose different backends for the computation.&lt;/p&gt;
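
&lt;p&gt;The pluggable-backend idea can be sketched as follows (the class names here are hypothetical illustrations, not QuTrunk's actual API): any object implementing &lt;code&gt;run()&lt;/code&gt; can be swapped in as a backend.&lt;/p&gt;

```python
# Illustrative sketch of a pluggable backend interface (hypothetical
# names, not QuTrunk's actual classes).
class Backend:
    def run(self, circuit, shots):
        raise NotImplementedError

class BackendLocal(Backend):
    """Toy local backend: pretends every shot of a Bell circuit
    collapses to '00' or '11' with equal probability."""
    def run(self, circuit, shots):
        half = shots // 2
        return {"00": half, "11": shots - half}

def execute(circuit, backend: Backend, shots: int = 1024):
    # The framework only depends on the Backend interface, so a remote
    # or hardware backend could be substituted without changing callers.
    return backend.run(circuit, shots)

print(execute(["H q[0]", "CNOT q[0],q[1]"], BackendLocal()))
# {'00': 512, '11': 512}
```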
&lt;h2&gt;
  
  
  &lt;strong&gt;4. Usage examples of the QuTrunk quantum programming framework&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;4.1 Start your first quantum program with QuTrunk&lt;/strong&gt;&lt;br&gt;
After deploying QuTrunk, we can write and run our first quantum computing program.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.1.1 Switch the Python interpreter&lt;/strong&gt;&lt;br&gt;
Press &lt;code&gt;ctrl+shift+p&lt;/code&gt; on Windows and Linux, or &lt;code&gt;command+shift+p&lt;/code&gt; on macOS, to open the command palette. Then type quan to search, and select quan: python interpreter switch. Switching the Python interpreter can apply to the whole workspace or to a single project, i.e. it switches the Python environment.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffql2kwkikcs90fpz8hy.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fffql2kwkikcs90fpz8hy.png" alt=" " width="800" height="158"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.1.2 Create a new project&lt;/strong&gt;&lt;br&gt;
Create a new project under the directory where it should be saved, such as Qun-Demo. In the IDE start interface, select Open Folder, and then select the newly created folder. The display is as follows:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnwlvew91pec1mx5jsse4.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fnwlvew91pec1mx5jsse4.png" alt=" " width="554" height="303"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.1.3 Compiling and running a quantum computing demo program&lt;/strong&gt;&lt;br&gt;
From the start interface, create a new Python file and save it as demo.py. The following code is an example of the bell_pair algorithm.&lt;br&gt;
Step 1: Environment preparation&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;python
    from qutrunk.circuit import QCircuit
from qutrunk.circuit.gates import H, CNOT, Measure
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Quantum circuit: the &lt;code&gt;qutrunk.circuit&lt;/code&gt; module in the code above provides the quantum circuit functionality.&lt;br&gt;
Quantum logic gate operations: &lt;code&gt;from qutrunk.circuit.gates import H, CNOT, Measure&lt;/code&gt; makes the H gate, CNOT gate, and measurement operation available.&lt;/p&gt;

&lt;p&gt;Step 2: Initialize the quantum circuit and allocate quantum registers&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;    &lt;span class="n"&gt;qc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;QCircuit&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="n"&gt;qr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;qc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;allocate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="sb"&gt;``&lt;/span&gt;&lt;span class="err"&gt;`&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;endraw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="n"&gt;python&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="o"&gt;%&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;%&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;

&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Step 3: Quantum logic gate operation &lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
    H * qr[0]
    CNOT * (qr[0], qr[1])
    Measure * qr[0]
    Measure * qr[1]

    # print circuit
    qc.print()


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;Print circuit: once the quantum circuit code has been written, the quantum circuit can be printed directly. (The &lt;code&gt;Printer&lt;/code&gt; instruction can be used to customize this output.)&lt;br&gt;
The state predicted from the gate operations is (｜00〉+｜11〉)/√2.&lt;/p&gt;

&lt;p&gt;Step 4: Bell Pair quantum circuit operation and result&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
    # run circuit
    res = qc.run(shots=1024)

    # print result
    print(res.get_measure())
print(res.get_counts())


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;At present, QuTrunk can output the quantum circuit diagram directly when running the quantum circuit.&lt;/p&gt;

&lt;p&gt;Step 5: Output the operation results&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;
python
   *qreg q[2]  
    creg c[2]  
    H * q[0]  
    MCX(1) * (q[0], q[1])  
    Measure * q[0]  
    Measure * q[1]  
[{"00": 505}, {"11": 519}]*  


&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Define quantum registers: the register definitions above use the qreg operation. The instruction &lt;code&gt;qreg q[2]&lt;/code&gt; defines a 2-bit quantum register named q; similarly, &lt;code&gt;creg c[2]&lt;/code&gt; defines a classical register named c.&lt;br&gt;
Measurement result: the measurement result is [{"00": 505}, {"11": 519}], meaning that out of 1024 shots, the outcomes "00" and "11" were observed 505 and 519 times respectively. Running the program several times gives different counts; for example, a second and a third run produced [{"00": 507}, {"11": 517}] and [{"00": 528}, {"11": 496}] respectively.&lt;/p&gt;
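
&lt;p&gt;The run-to-run variation is ordinary shot noise: each shot is an independent sample from the ideal Bell distribution {"00": 0.5, "11": 0.5}. A small sketch (illustrative, seeded for reproducibility):&lt;/p&gt;

```python
import random

# Each shot independently samples "00" or "11" with probability 1/2,
# so different runs (different seeds) give slightly different counts.
def sample_bell(shots: int, seed: int):
    rng = random.Random(seed)
    counts = {"00": 0, "11": 0}
    for _ in range(shots):
        counts["00" if rng.random() < 0.5 else "11"] += 1
    return counts

for seed in (1, 2, 3):
    print(sample_bell(1024, seed))  # counts near 512/512, never exact
```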

&lt;p&gt;&lt;strong&gt;4.2 Visual Programming&lt;/strong&gt;&lt;br&gt;
&lt;strong&gt;4.2.1 Initialize the quantum programming workspace&lt;/strong&gt;&lt;br&gt;
Open QuBranch and first initialize the quantum programming workspace: press &lt;code&gt;ctrl+shift+p&lt;/code&gt; on Windows or &lt;code&gt;command+shift+p&lt;/code&gt; on macOS to open the command palette, type quan, then select &lt;code&gt;quan:初始化量子编程&lt;/code&gt; (initialize quantum programming) and run it. This initializes the visual programming workspace, creating a virtual workspace in which developers can program visually.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50c9q0koy4ug3res9e3f.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F50c9q0koy4ug3res9e3f.png" alt=" " width="800" height="210"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.2.2 Start visual quantum programming&lt;/strong&gt;&lt;br&gt;
Press &lt;code&gt;ctrl+shift+p&lt;/code&gt;/&lt;code&gt;command+shift+p&lt;/code&gt; again and type quan; you will see the visual quantum programming workspace. Select &lt;code&gt;quan:量子可视化编程&lt;/code&gt; (quantum visual programming) to start the visual programming feature. It lets users generate multiple qdoor files for visual programming; switching between qdoor files shows different quantum circuit diagrams. Circuit diagrams can be edited either by coding or by dragging and dropping quantum gate symbols: dragging a gate into the diagram adds it, and dragging a gate out of the diagram removes it. Currently H, NOT, Sdg, Toffoli, Tdg, X, Y, Z, P, Rx, Ry, Rz, R, SqrtX, T, and Measure are supported; X, Y, Z, P, Rx, Ry, and Rz allow adding a control bit, and Rx, Ry, and Rz allow changing the rotation angle. QuBranch also provides keyword highlighting, code hints, code auto-completion, and other features. With the help of QuTrunk, developers can check the quantum state statistics of the current quantum circuit diagram.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1g6pn8sbwvzgxfepw8h.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb1g6pn8sbwvzgxfepw8h.jpg" alt=" " width="800" height="200"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4.2.3 Visual programming example&lt;/strong&gt;&lt;br&gt;
In the visual programming workspace, developers can add or remove graphical programming elements by dragging and dropping them. To delete an element, simply drag it out of the window and release it. A visual programming example is shown below:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9l6t3fefm9fzje2z1iyv.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F9l6t3fefm9fzje2z1iyv.jpg" alt=" " width="800" height="598"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h2&gt;
  
  
  &lt;strong&gt;5. Summary&lt;/strong&gt;
&lt;/h2&gt;

&lt;p&gt;As quantum technology continues to develop, quantum programming frameworks keep improving in usability and compatibility. As the overview above shows, a quantum programming framework gives developers a programming environment, simplifies many tedious basic steps, and hides the complexity of the underlying technology, so that more people can participate in the research and development of quantum technology, which in turn helps drive the field forward.&lt;/p&gt;

</description>
      <category>java</category>
      <category>softwareengineering</category>
      <category>discuss</category>
    </item>
  </channel>
</rss>
