Neeraj Iyer

Amazon Quicksight - Best practices Part 2

This is part 2 of Amazon Quicksight best practices that an organization should follow to stay consistent across all teams and follow the same process for building and deploying Quicksight assets.

Link to part 1 - Part 1

Data preparation
Data preparation is a critical step in creating accurate and insightful analyses in Quicksight.
Some best practices that you should follow while preparing your data for analysis are:

Query mode
Decide how data arrives from your data source: through direct query or SPICE. If you need to see the latest data instantly, consider direct query. If you use complex calculations and are okay with cached data, consider SPICE; you will then refresh your data manually or set up an automated schedule to refresh it at regular intervals.
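
For illustration, a scheduled SPICE refresh can also be configured through the API. Below is a minimal boto3 sketch, assuming hypothetical account and dataset IDs:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Schedule a daily full refresh of a SPICE dataset.
# The account ID and dataset ID are placeholders.
quicksight.create_refresh_schedule(
    AwsAccountId="123456789012",
    DataSetId="sales-dataset",
    Schedule={
        "ScheduleId": "daily-full-refresh",
        "ScheduleFrequency": {
            "Interval": "DAILY",
            "TimeOfTheDay": "06:00",  # HH:MM
        },
        "RefreshType": "FULL_REFRESH",
    },
)
```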

Have an understanding of your data and optimize it:
Understand your data well and optimize it for performance where needed. If you do not need all the data, filter it at the source and bring in only what is required. Consider pre-aggregating your data (sums, averages, max, min) at the source itself, which reduces the data volume.

Joining tables and establishing relationships
If you have to join multiple tables, make sure you establish appropriate relationships between them. Use the correct join type (inner, full, left, right) based on the relationship between the datasets. Prefer explicit joins over cartesian products, which lead to performance issues.

Creating data dictionaries and process flow:
It’s a good practice to document the entire process flow using tools like Lucidchart, and to document all the fields, sources, relationships and transformations applied in your dataset, which makes it easier for people new to the project to understand the data.
Quicksight One Big Table (OBT) data model – Quicksight follows a One Big Table data model approach rather than a star schema approach. Joining multiple fact tables is on Quicksight's future roadmap.
Ensure data is clean and normalized – Standardize date and address formats, remove duplicates from your data, and handle missing values as needed.
Apply row-level or column-level security to your data – If you need to restrict data by user, you can apply row-level security either while preparing the data or after the dataset is created.

Apply your data transformations in Quicksight itself or in your data layer - the best practice is to decide based on your use case whether your data requires complex or basic transformations. Performance is an important consideration here: some transformations do not perform as well in Quicksight as the same transformation in the data layer. For complex ETL tasks, consider using AWS Glue before bringing the data into Quicksight. If you are comfortable using SQL and your transformations are simple, you can use custom SQL queries within Quicksight to filter, join and transform data as it is imported. So, choose where a transformation should be done based on your data and performance considerations.
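
As a sketch of the custom SQL route, the same pre-aggregation idea can be expressed when creating a dataset through the API. The IDs, ARN, table and column names, and query below are all hypothetical:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Create a SPICE dataset backed by a pre-aggregating custom SQL query.
# All IDs, ARNs, and column names are illustrative placeholders.
quicksight.create_data_set(
    AwsAccountId="123456789012",
    DataSetId="daily-sales-agg",
    Name="Daily Sales (pre-aggregated)",
    ImportMode="SPICE",
    PhysicalTableMap={
        "salesAgg": {
            "CustomSql": {
                "DataSourceArn": "arn:aws:quicksight:us-east-1:123456789012:datasource/sales-db",
                "Name": "daily_sales_agg",
                "SqlQuery": (
                    "SELECT sale_date, region, SUM(amount) AS total_sales "
                    "FROM sales GROUP BY sale_date, region"
                ),
                "Columns": [
                    {"Name": "sale_date", "Type": "DATETIME"},
                    {"Name": "region", "Type": "STRING"},
                    {"Name": "total_sales", "Type": "DECIMAL"},
                ],
            }
        }
    },
)
```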

Sharing analyses/dashboards/datasets:
Sharing analyses, dashboards & datasets in Amazon Quicksight is straightforward, and Quicksight ensures you share your resources only with the right people who should have access to that specific data, which maintains data security.
Some best practices that should be followed while sharing analyses/dashboards are:

Place your artifacts in folders
Organize your artifacts like analyses, dashboards and datasets into folders or shared folders so that it is easy to share them with other users who use the same artifacts. Name your folders appropriately so they are easily recognizable by other team members.

Provide appropriate access permissions
You can grant users viewer or co-owner permission; choose based on the use case and the user's role. Grant co-owner permission to as few users as possible so that not everyone is able to modify the artifacts.

Utilize groups to share access
Instead of sharing your artifacts with individual users, you can share them with a group, which makes user management easier. Once you have grouped users, you can share access to your resources or artifacts easily.
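
A minimal boto3 sketch of the group-based flow; the group, user, and dashboard names are hypothetical, and the actions shown are the commonly used read-only (viewer) set:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")
account_id = "123456789012"  # placeholder

# Create a group and add a user to it (names are placeholders).
quicksight.create_group(
    AwsAccountId=account_id, Namespace="default", GroupName="sales-readers"
)
quicksight.create_group_membership(
    AwsAccountId=account_id,
    Namespace="default",
    GroupName="sales-readers",
    MemberName="jane.doe",
)

# Share a dashboard with the whole group as viewers.
quicksight.update_dashboard_permissions(
    AwsAccountId=account_id,
    DashboardId="sales-dashboard",
    GrantPermissions=[{
        "Principal": f"arn:aws:quicksight:us-east-1:{account_id}:group/default/sales-readers",
        "Actions": [
            "quicksight:DescribeDashboard",
            "quicksight:ListDashboardVersions",
            "quicksight:QueryDashboard",
        ],
    }],
)
```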

Utilize Quicksight embedding feature
If you want to share access with users within your own application, you can embed your dashboard in your web application or portal. This is secure and follows security best practices.

Follow naming conventions
As mentioned previously, follow naming conventions for folders, groups and file naming for all artifacts.

Sharing dashboard using links
If you are sharing dashboards with external users via a link, make sure you share it with the right people. Link recipients are readers of the dashboard and will not have access to the underlying data, but you may still want to restrict who your dashboards are visible to.

Utilize email reporting
Quicksight has an email reporting feature where you can automatically send dashboards, or schedule snapshots of the dashboards to be sent. You can share these with all users who are intended to view the dashboard or the underlying data.

Utilize Quicksight’s recent and favorites options
Users can add dashboards/analyses to favorites, from where they can easily get back to those resources. However, it is still better to use folders/shared folders to share resources among different users.

Audit resource permissions periodically
Make sure your resources are audited frequently. There might be stale analyses or dashboards that are no longer used or were created only for testing. This is why naming resources appropriately is encouraged: it makes it easy to decide whether to keep them or purge them. If users in the account have left the organization, make sure their resources are shared with other users before those users are deleted.
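
A sketch of such an audit with boto3, walking every dashboard in the account and listing who has access; the account ID is a placeholder and output handling is up to you:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")
account_id = "123456789012"  # placeholder

# Walk all dashboards in the account and print their permissions.
paginator = quicksight.get_paginator("list_dashboards")
for page in paginator.paginate(AwsAccountId=account_id):
    for summary in page["DashboardSummaryList"]:
        perms = quicksight.describe_dashboard_permissions(
            AwsAccountId=account_id, DashboardId=summary["DashboardId"]
        )
        for perm in perms["Permissions"]:
            print(summary["Name"], "->", perm["Principal"], perm["Actions"])
```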

Security best practices
Amazon Quicksight provides several security features that will not only protect your data but also maintain confidentiality and integrity while accessing different AWS resources and external resources.

Contact your security administrator to get you access to Quicksight
We have multiple AWS accounts, and different accounts have their own Quicksight accounts. Make sure you know which project you are working on and gain access to the appropriate account. Once you have access through Microsoft Entra, you should be able to see your AWS accounts and navigate to the one that has your Quicksight account. Determine which permissions you need as a Quicksight user; most users will not need administrator access, so request permissions based on your role in the project. Your administrator will assign the appropriate IAM users/roles/policies that you need to access different resources.

Remember the principle of least privilege
AWS follows the principle of least privilege, where each user is given the minimum privileges needed to access resources. In Quicksight you should be granted the least privilege based on your role: admin access should be limited to a few users, users who create analyses/dashboards should be authors, and users who only consume (view) dashboards should be readers. Sharing of resources should follow the same principle of least privilege.

Monitor access using CloudTrail
You can enable AWS CloudTrail to log all API calls made by or on behalf of Quicksight users, which helps audit user activity.
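
For instance, recent Quicksight API activity can be pulled from CloudTrail like this (a sketch; the time window and output format are arbitrary choices):

```python
from datetime import datetime, timedelta, timezone
import boto3

cloudtrail = boto3.client("cloudtrail", region_name="us-east-1")

# Look up Quicksight API events from the last 24 hours.
events = cloudtrail.lookup_events(
    LookupAttributes=[{
        "AttributeKey": "EventSource",
        "AttributeValue": "quicksight.amazonaws.com",
    }],
    StartTime=datetime.now(timezone.utc) - timedelta(days=1),
)
for event in events["Events"]:
    print(event["EventTime"], event["EventName"], event.get("Username"))
```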

Use IAM roles to secure data sources
Use IAM roles to access other AWS services like S3, RDS, Redshift, Aurora and Athena. This way you grant only the specific permissions needed instead of exposing broader access.

Rotate credentials regularly
You can utilize AWS Secrets Manager to securely store and rotate credentials. Rotate the access credentials used by Quicksight, such as IAM keys, database credentials and API keys, regularly.
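
As an illustration, a Quicksight data source can point at a Secrets Manager secret instead of hard-coded credentials, so rotation happens in one place. The ARNs and connection parameters below are placeholders:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Create a PostgreSQL data source whose credentials live in Secrets Manager.
quicksight.create_data_source(
    AwsAccountId="123456789012",
    DataSourceId="sales-postgres",
    Name="Sales PostgreSQL",
    Type="POSTGRESQL",
    DataSourceParameters={
        "PostgreSqlParameters": {
            "Host": "sales-db.example.internal",  # placeholder host
            "Port": 5432,
            "Database": "sales",
        }
    },
    Credentials={
        "SecretArn": "arn:aws:secretsmanager:us-east-1:123456789012:secret:sales-db-creds"
    },
)
```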

Implement Row-level security
If you want to restrict access to your dataset to specific users or groups, you can apply row-level security. With this restriction, only users authorized to view the data can see it.
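
A sketch of attaching user-based rules when creating a dataset: the rules dataset (referenced by ARN) contains a UserName or GroupName column plus the columns to filter on, and every name, ID, and ARN here is hypothetical:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Attach row-level security to a dataset at creation time.
# The RLS rules dataset (referenced by ARN) maps UserName/GroupName
# values to the row values each principal is allowed to see.
quicksight.create_data_set(
    AwsAccountId="123456789012",
    DataSetId="sales-secured",
    Name="Sales (row-level secured)",
    ImportMode="SPICE",
    PhysicalTableMap={
        "sales": {
            "RelationalTable": {
                "DataSourceArn": "arn:aws:quicksight:us-east-1:123456789012:datasource/sales-db",
                "Name": "sales",
                "InputColumns": [
                    {"Name": "region", "Type": "STRING"},
                    {"Name": "amount", "Type": "DECIMAL"},
                ],
            }
        }
    },
    RowLevelPermissionDataSet={
        "Arn": "arn:aws:quicksight:us-east-1:123456789012:dataset/rls-rules",
        "PermissionPolicy": "GRANT_ACCESS",
        "FormatVersion": "VERSION_1",
        "Status": "ENABLED",
    },
)
```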

Enable multi-factor authentication
Have your users use multi-factor authentication for an additional layer of security when accessing Quicksight.

Monitoring using CloudWatch
Use Amazon CloudWatch to monitor the performance, health and usage of Quicksight. If you need to set up some metrics or alarms you can do so with CloudWatch to detect any anomalies or security breaches.

Private connections to your database through VPC
You can use a Virtual Private Cloud (VPC) to access private data sources from Quicksight without exposing your data to the public internet. To enhance security, you can establish VPC endpoints to connect to data sources like RDS/Redshift in your VPC.
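
Creating a VPC connection via the API looks roughly like this sketch; the subnets, security group, and execution role are placeholders:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Create a VPC connection Quicksight can use to reach private data sources.
# All IDs and the role ARN are placeholders.
quicksight.create_vpc_connection(
    AwsAccountId="123456789012",
    VPCConnectionId="analytics-vpc-conn",
    Name="Analytics VPC connection",
    SubnetIds=["subnet-0abc1234", "subnet-0def5678"],
    SecurityGroupIds=["sg-0123456789abcdef0"],
    RoleArn="arn:aws:iam::123456789012:role/quicksight-vpc-connection-role",
)
```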

Implement security best practices for embedding
While embedding Quicksight dashboards or the authoring experience into applications, make sure you use a secure session for embedding by generating secure session URLs so that there is no unauthorized access to your embedded dashboards. It is advisable to use IAM roles and session-based permissions for embedding.
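
A sketch of generating a short-lived embed URL for a registered user; the user ARN and dashboard ID are placeholders:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Generate a session-scoped URL for embedding a dashboard.
response = quicksight.generate_embed_url_for_registered_user(
    AwsAccountId="123456789012",
    SessionLifetimeInMinutes=60,
    UserArn="arn:aws:quicksight:us-east-1:123456789012:user/default/jane.doe",
    ExperienceConfiguration={
        "Dashboard": {"InitialDashboardId": "sales-dashboard"}
    },
)
embed_url = response["EmbedUrl"]  # hand this to your web app's frontend
```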

Enable data encryption
If you have mission-critical data that should not be exposed, you can encrypt your data at rest using AWS Key Management Service (KMS); data in transit is protected with SSL/TLS.

Networking best practices
While setting up Quicksight or any dataset in Quicksight, networking is a very important aspect that cannot be ignored to ensure that you have secure, efficient and reliable access. Some best practices that should be followed are:

Direct connectivity to data sources
If you have data sources like RDS, Redshift or on-prem databases within a Virtual Private Cloud (VPC), configure Quicksight to access these resources via VPC connections.

Private VPC connections
As described in the security section, use Virtual Private Cloud (VPC) connections to access private data sources from Quicksight without exposing your data to the public internet, and establish VPC endpoints to connect to data sources like RDS/Redshift in your VPC.

Network Latency and Performance
It is always good practice to connect Quicksight to data sources in the same AWS region as Quicksight to improve performance and reduce latency. You can utilize SPICE to store your data in memory, which reduces network congestion and improves performance.

Auditing and monitoring
You can enable VPC flow logs to monitor the traffic between Quicksight and your data sources, which will help in identifying network issues. To track your Quicksight API calls you can use AWS CloudTrail and CloudWatch for monitoring and setting up alarms based on metrics you define.

IP Addressing
If you would like to use fixed IP addresses for firewall configurations, refer to this document, which provides the Quicksight IP ranges - https://docs.aws.amazon.com/quicksight/latest/user/regions.html. Review it regularly and update your firewall rules in case the ranges change.

Security Groups
As a rule of thumb, follow the principle of least privilege: allow only the necessary inbound and outbound traffic to and from Quicksight in your security groups. You must configure security groups to allow Quicksight access to your data sources; for example, to let Quicksight reach a database you must open its port - 5432 for PostgreSQL, 5439 for Redshift, 3306 for MySQL.
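
For example, allowing traffic from the security group used by the Quicksight VPC connection into a Postgres database's security group might look like this sketch; the group IDs are placeholders:

```python
import boto3

ec2 = boto3.client("ec2", region_name="us-east-1")

# Allow inbound PostgreSQL (5432) traffic from the security group
# associated with the Quicksight VPC connection. IDs are placeholders.
ec2.authorize_security_group_ingress(
    GroupId="sg-0db1234567890abcd",  # database security group
    IpPermissions=[{
        "IpProtocol": "tcp",
        "FromPort": 5432,
        "ToPort": 5432,
        "UserIdGroupPairs": [
            {"GroupId": "sg-0qs1234567890abcd"}  # Quicksight connection SG
        ],
    }],
)
```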

Folder/shared folders for group of users
Using folders or shared folders in Amazon Quicksight is crucial, as it helps in organizing resources and sharing them with multiple users or groups of users. With this feature you can collaborate with your team.
A folder in Quicksight is for users to organize their own resources, while a shared folder is used to collaborate with multiple users.

Some best practices to follow while using folder or shared folders are:

Shared folder naming conventions
As mentioned earlier, follow a folder naming convention that is descriptive enough for other users to follow when using shared folders. Some examples are Sales_dashboards, Marketing_resources, HR_resources, Insurance_resources.

Another use case for folders is to segregate resources by environment. If you have dev, test and prod environments, you should have three folders and place each resource in its respective folder. If you do not have the ability to create shared folders, reach out to your admin to get one created. By segregating by environment, you will not accidentally push development updates directly to prod.

Permissions to folders
Again, follow the principle of least privilege by providing folder access only to those who require it. Grant users viewer or editor permissions based on their role.

Folder organization
Organize your folders in a clear hierarchy based on your needs, whether that is by date and time, or by department, project or data source.

Managing a shared folder
Assign an administrator for your shared folder so that they can manage access permissions for all users who have access to the resources in that folder. Also audit the shared folders periodically.

Content organization within folders
You can organize datasets, dashboards and analyses in separate folders so that it is easy for users to access and work with them.

Archive or purge old content
Periodically ensure that all the resources in your folders are still being used; otherwise archive them, or delete them after a set time frame of 60/90 days.

Notify team on any updates in shared folders
It is good practice to notify your team members when you make updates to any Quicksight artifacts, and to ensure that multiple changes are not made at the same time. This is another reason to use separate folders for different environments.

Creating Calculated fields
When creating calculated fields in Quicksight, it is important to follow some best practices so that your calculations provide accurate insights.

Some best practices to follow while creating calculated fields are-

Naming conventions for calculated fields
Use descriptive names for calculated fields that reflect the purpose of the calculation. For example, if you calculate sales for the current year to date, name it Sales_CYTD; other examples are total_sales, total_marketingcost, etc. Once created, you can identify a calculated field in your field list by the “=” marker.

Define a purpose for your calculation
Before diving into the actual calculation, define its purpose based on your need: is it a new field you are creating, will it be used to filter data, or will it transform existing data? This makes it easier to build calculations without errors. If you want to learn the syntax or get a basic understanding of how to create a calculated field, refer to this link - calculated fields

Improve performance
Keep your calculations as simple as possible by breaking them down into smaller steps or multiple calculated fields. You can also utilize SPICE (Super-fast, Parallel, In-memory Calculation Engine) for complex calculations, which will improve the performance of your insights.

Understanding your data
Before creating a calculated field, understand your data: the relationships between tables and the data types of fields. This helps you clean your data if something does not look correct, find duplicates, or standardize date formats.

Reuse calculations
If you have a complex calculation, break it into steps as multiple calculated fields that can be reused in other calculations, so you do not have to create new calculated fields from scratch.

Choose your data type
Make sure to choose the correct data type based on the existing data. If you need to parse your existing data to a number, string or date, you can do so using the functions toInt(), toString() and parseDate().

Handling null values
Be very careful when creating calculated fields on data that has null values. If you want a 0 displayed where there is a null, handle it in the calculated field using coalesce() or ifelse() so that you do not get an error.
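
Null handling can also be baked into a dataset-level calculated column. A sketch of a DataTransforms entry for the create/update dataset APIs, where the expression uses Quicksight's own ifelse/isNull/coalesce functions and the field names are placeholders:

```python
# A LogicalTableMap DataTransforms entry for create_data_set /
# update_data_set that adds a null-safe sales column. The {sales}
# field and column names are placeholders.
null_safe_sales = {
    "CreateColumnsOperation": {
        "Columns": [{
            "ColumnName": "sales_or_zero",
            "ColumnId": "sales_or_zero",
            "Expression": "ifelse(isNull({sales}), 0, {sales})",
            # Equivalent expression: "coalesce({sales}, 0)"
        }]
    }
}
```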

Validate your calculations
It is a good idea to validate your calculations by applying the calculated fields on a small subset of data before you finalize your calculations.

Version control
If you decide to update your calculations, it is good practice to keep the existing calculation and append a version number to its name before you finalize, so that you can roll back without recreating it. Examples: total_sales_v1, total_sales_v2.

Documentation
It is good practice to document all your calculations in your data dictionary so that you have a reference for all calculated fields, and new team members can get up to speed quickly.

CI/CD capabilities - BIOps capabilities
Amazon Quicksight does not provide its own version control feature, but version control and BI operations (BIOps) can be achieved by using the Quicksight CLI to manage datasets, analyses and dashboards.

Manual version control
You can use CLI commands to back up Quicksight artifacts like datasets, analyses and dashboards, executing them from an AWS terminal or a command prompt on your local machine.
The Quicksight artifacts are exported as JSON files that can be stored locally or in an S3 bucket for backup or version control. You can also automate the CLI commands in Python, PHP, or another language using the appropriate libraries and export the JSON files. Here is a reference to the Quicksight CLI commands - CLI commands reference.
You can also move your resources from one account to another if you have different environments like Dev, Test & Prod, and you can use Git to store historical Quicksight resources if you need to refer to them in the future.
The best practice here is to back up all Quicksight artifacts locally or in an S3 bucket periodically so that you always have a backup to roll back to.
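
A sketch of that backup loop using boto3 and S3; the account ID, bucket name and key prefix are placeholders:

```python
import json
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")
s3 = boto3.client("s3")
account_id = "123456789012"  # placeholder

# Export every dashboard's definition to S3 as JSON for backup/versioning.
paginator = quicksight.get_paginator("list_dashboards")
for page in paginator.paginate(AwsAccountId=account_id):
    for summary in page["DashboardSummaryList"]:
        definition = quicksight.describe_dashboard_definition(
            AwsAccountId=account_id, DashboardId=summary["DashboardId"]
        )
        s3.put_object(
            Bucket="my-quicksight-backups",  # placeholder bucket
            Key=f"dashboards/{summary['DashboardId']}.json",
            Body=json.dumps(definition, default=str),
        )
```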

Naming conventions for manual version control
Use consistent naming conventions for analyses, dashboards and datasets so they are easier to track. Append the version, date and environment to the end of the file name.

Example – File name_VERSION_DATE_ENVIRONMENT

Maintain a change log that tracks every update along with the owner of that change. If you have an older version of a Quicksight analysis or dashboard, you can add it to the _Archive folder. If you later need to roll back to an older version, you can do so using the archived copy.

Automating using CI/CD pipelines
If you have a large project with multiple Quicksight accounts and many Quicksight resources, you can use AWS services like Lambda, CloudFormation, CodePipeline and S3 to automate the deployment of Quicksight resources and version them.
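
One building block for such a pipeline is the asset bundle export API, which packages resources (optionally with their dependencies) into a deployable artifact; the account ID, job ID, and ARN below are placeholders:

```python
import boto3

quicksight = boto3.client("quicksight", region_name="us-east-1")

# Start an export job that bundles a dashboard (plus dependencies)
# into a deployable artifact. IDs and ARNs are placeholders.
quicksight.start_asset_bundle_export_job(
    AwsAccountId="123456789012",
    AssetBundleExportJobId="nightly-export-001",
    ResourceArns=[
        "arn:aws:quicksight:us-east-1:123456789012:dashboard/sales-dashboard"
    ],
    IncludeAllDependencies=True,
    ExportFormat="QUICKSIGHT_JSON",
)

# Poll for the result; the response includes a download URL when done.
job = quicksight.describe_asset_bundle_export_job(
    AwsAccountId="123456789012",
    AssetBundleExportJobId="nightly-export-001",
)
print(job["JobStatus"])
```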

SPICE dataset version control
You can take a point-in-time snapshot of your SPICE dataset before a manual or scheduled refresh if you need to back up your SPICE data for a particular time frame. You can use this snapshot to show data in your dashboard for that time frame if you have such a use case.

User Acceptance Testing
If you have implemented a deployment pipeline, validate the resources before they are pushed into a different environment.

Different environments and access controls
If you have separate Quicksight accounts for dev, test and prod environments, not every user needs access to all of them. Restrict user access to specific environments using role-based access; in prod, users who are not developers of a dashboard can have read-only access to view it.

Documentation
Make sure the process that is followed by your team is documented and communicated to all the team members so that they are aware of the process.

Disaster recovery plan
Always make sure to backup your resources periodically and in case of a disaster (accidental deletion or other reasons) you always should be ready to restore your resources from your backup that can be S3 or Git or any version control that you might use.

Peer reviews
Implement a peer review process for critical changes before they are pushed to the production environment, ensuring multiple people have reviewed them before they are finalized.

Creating Visuals in Amazon Quicksight
Creating visuals is the most important purpose of any dashboard, since visuals are what deliver useful insights to your users or audience. Some best practices to follow while creating visuals in Quicksight are:

Know the purpose of your user
Different users require data insights at different granularities, so understand your users and select appropriate visuals. An executive will seek only high-level insights, whereas an analyst or developer will want more detail.

Pick the correct visual type
After understanding your data, pick the correct visual type for it. Quicksight offers different visuals, which can be found here - Quicksight visuals
You can even use AutoGraph, where Quicksight recommends a graph for you based on your data. After plotting a graph, make sure your data is clearly visible and the visual makes sense for deriving useful insights.
You can choose your graphs based on your requirements:

  • To compare different categories of data, use bar or column charts.
  • To show correlation between data points, use scatter plots.
  • To show geographic data, use geospatial maps.
  • To show data density, use heat maps.
  • To show trends over time, use line charts.
  • To show a fraction of the whole, use donut or pie charts.

Title of graph

Use descriptive and clear titles and labels for each graph. Data labels should be clearly visible, so make sure the data colors you choose contrast well with the text color. Titles and labels should not overlap or look clunky; everything should be easily readable by the user.

Filters
If you apply filters to your data, make sure they are placed at the top and are clearly visible; they should not be overlooked because of poor color contrast. For more best practices on filters, refer to the section in this series that describes best practices for efficient filtering.

Use spacing wisely
After plotting your graph, eliminate unnecessary labels, text, and titles that you do not need. This saves space and keeps the graph from looking cluttered. Once you have all your graphs created on one sheet, rearrange them in a logical order so it is easier for the reader to follow what you are trying to convey. Use themes to apply appropriate data colors, text colors, and font styles, and make sure your borders and layout are styled appropriately. If you want to highlight data or text, do so to draw the user's attention. Place the most important visuals at the top left, flowing toward the right.

Improve performance
Keep calculations simple for quick response times, limit the amount of data loaded for quick load times, and filter and aggregate data beforehand to reduce the volume that is refreshed with the dashboard each time.

Use of colors
Apply data colors appropriately so the visual is readable and not cluttered by text or poor color contrast. If you have KPIs in your graph, color them green for positive trends and red for negative trends. As mentioned in the themes section, use colors suitable for all users, including those who are color blind. Do not use so many colors that users get confused and lose focus on the data insights.

More best practices coming soon!
