DEV Community

Karim Elkobrossy for AWS Community Builders

Amazon SageMaker Ground Truth

This post reviews the options Amazon SageMaker Ground Truth provides for labelling data. Ground Truth is a service within Amazon SageMaker for labelling datasets that are later used to build machine learning models. Three workforce options are available when using this service:

  1. Mechanical Turk

  2. Private labelling workforce

  3. Vendor

The Mechanical Turk workforce is a global, on-demand pool of workers, provided through Amazon Mechanical Turk, that is available around the clock for labelling and human review tasks. Because this is a public workforce, your data must be free of any personally identifiable information (PII). Use this workforce when the labelling work needs no special expertise, you want to save time, and your data contains no PII.

A private labelling workforce is a team of workers that you choose. They could be employees of your company or a group of subject matter experts. For example, if you have a dataset of X-ray images and want to classify them by whether they show a certain disease, you need workers with medical expertise. A private workforce is also the right choice when your data contains PII.

A vendor workforce is a selection of experienced vendors who specialise in providing data labelling services. They can be found on AWS Marketplace.
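
The guidance above can be condensed into a small decision helper. This is only a sketch of the rules of thumb in this post, not an AWS API: the function name `suggest_workforce`, its parameters, and the returned strings are all hypothetical.

```python
def suggest_workforce(contains_pii: bool,
                      needs_domain_expertise: bool,
                      prefers_managed_vendor: bool = False) -> str:
    """Suggest a workforce type using the rules of thumb from this post.

    Hypothetical helper: the names and return values are illustrative only.
    """
    # PII must not be shown to a public workforce, and expert tasks
    # (such as reading X-rays) need workers you select yourself.
    if contains_pii or needs_domain_expertise:
        return "private"
    # Experienced labelling vendors are available through AWS Marketplace.
    if prefers_managed_vendor:
        return "vendor"
    # Simple tasks on data without PII suit the public workforce.
    return "mechanical_turk"


print(suggest_workforce(contains_pii=False, needs_domain_expertise=False))
print(suggest_workforce(contains_pii=True, needs_domain_expertise=False))
```

Real projects usually weigh cost and turnaround time as well, so treat the result as a starting point.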

Let us now take a look at the different types of labelling jobs available for the image data type:

Images

  1. Image classification (Single label)

In this task, workers assign each image to exactly one class.
In this example, we choose either Basketball or Soccer as the label for the image.
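
Before workers can classify anything, a labelling job needs a list of the images to show them. Ground Truth reads that list from an input manifest, a JSON Lines file in which each line points at one object through a `source-ref` key. The sketch below builds such a manifest in memory; the bucket name, the object keys, and the helper `build_manifest` are made up for illustration.

```python
import json


def build_manifest(image_uris):
    """Return JSON Lines text with one {"source-ref": uri} object per image."""
    return "\n".join(json.dumps({"source-ref": uri}) for uri in image_uris)


# Hypothetical S3 locations; replace them with your own bucket and keys.
uris = [
    "s3://example-bucket/images/ball_001.jpg",
    "s3://example-bucket/images/ball_002.jpg",
]

manifest = build_manifest(uris)
print(manifest)
```

The resulting text would normally be saved as a `.manifest` file and uploaded to S3 alongside the images.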

  2. Image classification (Multi-label)

In this task, workers assign one or more classes to each image.
In this example, we choose all of the labels that are present in the image.
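
Downstream, a multi-label answer is often turned into a multi-hot vector with one slot per class. The sketch below shows that encoding. It is a generic illustration and not the Ground Truth output format; the class list and the helper `to_multi_hot` are assumptions.

```python
def to_multi_hot(selected, classes):
    """Encode the selected labels as a 0/1 vector ordered like `classes`."""
    unknown = set(selected) - set(classes)
    if unknown:
        raise ValueError(f"labels not in the class list: {sorted(unknown)}")
    chosen = set(selected)
    return [1 if name in chosen else 0 for name in classes]


# Hypothetical class list for a sports dataset.
classes = ["Basketball", "Soccer", "Tennis"]

# A worker ticked every sport visible in the image.
vector = to_multi_hot(["Soccer", "Basketball"], classes)
print(vector)  # [1, 1, 0]
```

A single-label answer is the special case in which exactly one slot is 1.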

  3. Bounding box

In this task, workers draw bounding boxes around specified objects in the images.
In this example, we mark the location of each bird in the image by drawing a bounding box that surrounds it.
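
A bounding box is commonly stored as the pixel position of its top-left corner plus a width and a height, which is also the convention Ground Truth uses in its bounding box output. The sketch below models one box and checks that it lies inside the image. The `BoundingBox` class and the numbers are made up for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundingBox:
    """Axis-aligned box in pixels, measured from the top-left image corner."""
    label: str
    left: int
    top: int
    width: int
    height: int

    def corners(self):
        """Return (x_min, y_min, x_max, y_max)."""
        return (self.left, self.top,
                self.left + self.width, self.top + self.height)

    def fits_inside(self, image_width, image_height):
        """True when the box is non-empty and lies within the image."""
        x_min, y_min, x_max, y_max = self.corners()
        return (0 <= x_min < x_max <= image_width
                and 0 <= y_min < y_max <= image_height)


# Hypothetical annotation for one bird in a 640x480 image.
bird = BoundingBox("bird", left=40, top=30, width=120, height=80)
print(bird.corners())               # (40, 30, 160, 110)
print(bird.fits_inside(640, 480))   # True
```

An image with several birds would simply carry a list of such boxes.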

  4. Semantic segmentation

In this task, workers assign pixel-level labels to specific objects and segments in the image.
In this example, every pixel in the image is classified: the pixels of the plane are coloured red and the rest are black.
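
To make "a label for every pixel" concrete, the sketch below stores a tiny mask as a grid of class ids and renders it with the red and black colouring described above. The grid, the class ids, and the helper names are assumptions for illustration; real masks are full-size images.

```python
from collections import Counter

BACKGROUND, PLANE = 0, 1

# Colour per class id: black background, red plane (RGB tuples).
PALETTE = {BACKGROUND: (0, 0, 0), PLANE: (255, 0, 0)}

# Hypothetical 4x5 mask: each entry is the class id of one pixel.
mask = [
    [0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0],
    [0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0],
]


def pixel_counts(mask):
    """Count how many pixels belong to each class id."""
    return Counter(pixel for row in mask for pixel in row)


def colourise(mask):
    """Replace every class id with its RGB colour from PALETTE."""
    return [[PALETTE[pixel] for pixel in row] for row in mask]


counts = pixel_counts(mask)
print(counts[PLANE], counts[BACKGROUND])  # 4 16
print(colourise(mask)[1][1])              # (255, 0, 0)
```

Unlike a bounding box, the mask follows the exact outline of the object, which is why this task takes workers longer.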

  5. Label verification

In this task, workers verify existing labels in the dataset. This can be used to check prior work by human workers or by automated labelling jobs.
In this example, we verify whether the car's label is correct or incorrect.
