We just wrapped up the May ‘24 AI, Machine Learning and Data Science Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, as well as the upcoming Meetup schedule so that you can join us at a future event.
First, Thanks for Voting for Your Favorite Charity!
In lieu of swag, we gave Meetup attendees the opportunity to help guide a $200 donation to charitable causes. The charity that received the highest number of votes this month was AI4ALL, which opens doors to artificial intelligence for historically excluded talent through education and mentorship. We are sending this event's charitable donation of $200 to AI4ALL on behalf of the Meetup members!
Missed the Meetup? No problem. Here are playbacks and talk abstracts from the event.
From Research to Industry: Bridging Real-World Applications with Anomalib at the CVPR VAND Challenge
This talk highlights the role of Anomalib, an open-source deep learning framework, in advancing anomaly detection within AI systems, particularly showcased at the upcoming CVPR Visual Anomaly and Novelty Detection (VAND) workshop. Anomalib integrates advanced algorithms and tools to facilitate both academic research and practical applications in sectors like manufacturing, healthcare, and security. It features capabilities such as experiment tracking, model optimization, and scalable deployment solutions. Additionally, the discussion will include Anomalib’s participation in the VAND challenge, focusing on robust real-world applications and few-shot learning for anomaly detection.
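For a sense of how the library is used in practice, here is a minimal sketch of training and testing an anomaly detection model with Anomalib. It assumes the v1.x Python API, and the model, category, and defaults below are illustrative choices, not the configuration from the talk.

```python
# A minimal Anomalib sketch, assuming the v1.x API
# (anomalib.data / anomalib.models / anomalib.engine).
from anomalib.data import MVTec
from anomalib.models import Padim
from anomalib.engine import Engine

# MVTec AD "bottle" category; the data is downloaded automatically if missing.
datamodule = MVTec(category="bottle")

# PaDiM is one of several anomaly detection models bundled with Anomalib.
model = Padim()

# The Engine wraps training and evaluation.
engine = Engine()
engine.fit(model=model, datamodule=datamodule)
engine.test(model=model, datamodule=datamodule)
```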
Speaker: Samet Akcay is an AI research engineer and tech lead specializing in semi-/self-supervised learning, zero-/few-shot anomaly detection, and multi-modality. He is best known for his open-source contributions to the ML/DL community: he is the lead author of Anomalib, a major open-source anomaly detection library, and he also maintains OpenVINO Training Extensions, a low-code transfer learning framework for building computer vision models.
Q&A
- Will this model work on non-image datasets, such as card transactions?
- Are there situations where an anomaly detection problem cannot be framed as a straightforward classification problem?
- And how do you tackle such problems, where the notion of correctness is not as straightforward as in a classification setting?
- Given the reliance on unsupervised or semi-supervised methods in anomaly detection due to the unknown nature of abnormalities, what are the best practices or techniques to improve the robustness and accuracy of these models, particularly in differentiating between subtle anomalies and normal variations in industrial images?
Resource links
- VAND2.0 Challenge at CVPR with Intel
- MVTec Anomaly Detection (MVTec AD) anomaly detection benchmark dataset
- Anomalib GitHub
- OpenVINO Documentation
Anomaly Detection with Anomalib and FiftyOne
Most anomaly detection techniques are unsupervised, meaning that anomaly detection models are trained on unlabeled non-anomalous data. Developing the highest-quality dataset and data pipeline is essential to training robust anomaly detection models.
In this brief walkthrough, I will illustrate how to leverage open-source FiftyOne and Anomalib to build deployment-ready anomaly detection models. First, we will load and visualize the MVTec AD dataset in the FiftyOne App. Next, we will use Albumentations to test out augmentation techniques. We will then train an anomaly detection model with Anomalib and evaluate the model with FiftyOne.
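If you want to try the first steps yourself, here is a minimal sketch of loading and visualizing the data and previewing an augmentation. It assumes FiftyOne's Hugging Face integration and Albumentations are installed, and the Hub repo id below is an assumption; use the dataset location from the resource links further down.

```python
# A minimal sketch of the walkthrough's first steps: load MVTec AD into
# FiftyOne, open the App, and preview an Albumentations pipeline on one image.
import fiftyone as fo
import fiftyone.utils.huggingface as fouh
import albumentations as A
import cv2

# Load the dataset from the Hugging Face Hub (repo id is illustrative).
dataset = fouh.load_from_hub("Voxel51/mvtec-ad")
session = fo.launch_app(dataset)  # explore the samples in the FiftyOne App

# Preview a simple augmentation pipeline on the first sample.
transform = A.Compose([
    A.HorizontalFlip(p=0.5),
    A.RandomBrightnessContrast(p=0.5),
])
sample = dataset.first()
image = cv2.cvtColor(cv2.imread(sample.filepath), cv2.COLOR_BGR2RGB)
augmented = transform(image=image)["image"]
```

From there, the walkthrough trains a model with Anomalib (see the sketch in the previous talk's section) and brings the predictions back into FiftyOne for evaluation.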
Speaker: Jacob Marks is a Senior Machine Learning Engineer and Researcher at Voxel51, where he leads open source efforts in vector search, semantic search, and generative AI for the FiftyOne data-centric AI toolkit. Prior to joining Voxel51, Jacob worked at Google X, Samsung Research, and Wolfram Research. In a past life, he was a theoretical physicist: in 2022, he completed his Ph.D. at Stanford, where he investigated quantum phases of matter.
Resource links
- Tutorial on visual anomaly detection with FiftyOne and Anomalib
- Colab notebook to follow along
- MVTec AD Dataset on Hugging Face
- Dataset on try.fiftyone.ai
- FiftyOne Hugging Face Hub integration
To Infer or To Defer: Hazy Oracles in Human+AI Collaboration
This talk explores the evolving dynamics of human+AI collaboration, focusing on the concept of the human as a “hazy oracle” rather than an infallible source. It outlines the journey of integrating AI systems more deeply into practical applications through human+AI cooperation, discussing the potential value and challenges. The discussion includes the modeling of interaction errors and the strategic choices between immediate AI inference or seeking additional human input, supported by results from a user study on optimizing these collaborations.
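As a rough illustration of the infer-or-defer idea (not the error model or user-study formulation from the talk), a deferral policy can be sketched as a comparison between the expected benefit of new human input and the cost of asking for it. The names and numbers below are made up for the example.

```python
# A toy infer-or-defer policy: defer to the human only when the estimated
# benefit of another human input outweighs the cost of asking for it.
# The error estimates and cost are illustrative, not the talk's formulation.
from dataclasses import dataclass

@dataclass
class DeferralPolicy:
    interaction_cost: float = 0.1  # assumed cost of requesting more human input

    def decide(self, estimated_error: float, expected_error_after_help: float) -> str:
        expected_gain = estimated_error - expected_error_after_help
        return "defer" if expected_gain > self.interaction_cost else "infer"

policy = DeferralPolicy()
print(policy.decide(estimated_error=0.40, expected_error_after_help=0.10))  # "defer"
print(policy.decide(estimated_error=0.15, expected_error_after_help=0.10))  # "infer"
```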
Speaker: Jason Corso is a Professor of Robotics, Electrical Engineering, and Computer Science at the University of Michigan, and Co-Founder / Chief Scientist at AI startup Voxel51. His research spans computer vision, robotics, and AI, with over 150 peer-reviewed publications.
Q&A
- In autonomous driving, to avoid collisions, how is the distance between cars measured to maintain a certain gap?
- How does the resolution of the original video affect the accuracy of the bounding box and the subsequent pose estimation? Is there a threshold resolution below which the accuracy significantly degrades?
- What are the main factors contributing to the superior performance of the CH-CNN model when both keypoint map and class are used, as indicated by its higher accuracy across all vehicle types?
- There are differences in performance between the Gaussian fixed attention and uniform fixed attention models. Can you discuss how the type of attention mechanism affects the accuracy of keypoint feature-based models?
- How does the deferral function assess the potential improvement in accuracy that can be obtained from new human input? What parameters or data does it require to make this assessment?
- If the dual-loss addition error regression is trained on a dataset that differs from test time, doesn't its ability to defer during test time suffer from the same issues with ambiguity / task not being clear?
- When performing ID verification based on whether two images are identical, what would be your approach?
- Is the human collaboration available during inference time or is it just included inside the dataset during training?
- Infer or defer seems really useful for allowing robots to learn tasks autonomously and prompting a user for intervention when the robot doesn't know how to do a task. Has there been any attempt to apply infer or defer to robotics?
- After the model produces a "defer" output, how does the robot/system/AI know what actions to take, such as determining what went wrong with the human input and what questions to ask, to help it reach the final inference?
Resource links
- Prof Corso's website
- Check out Prof Corso's weekly Open Office Hours
- Code and data from the talk on GitHub
- A write-up on Medium about the research presented
Learning Robot Perception and Control using Vision with Action
To achieve general utility, robots must continue to learn in unstructured environments. In this talk, I describe how our mobile manipulation robot uses vision with action to 1) learn visual control, 2) annotate its own training data, and 3) learn to estimate depth for new objects and the environment. I also describe how, using these techniques, I led a small group to win consecutive robot competitions against teams from Stanford, MIT, and other universities.
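As a point of reference for the depth-estimation piece, the classical pinhole-camera geometry behind "depth from camera motion and object segmentation" can be sketched in a few lines. This is an illustrative baseline for the underlying idea, not the learned ODMD model from the talk or the linked papers.

```python
# Classical geometric baseline: if the camera moves a known distance toward an
# object, the change in the object's apparent size (e.g., segmentation-mask
# width) constrains its depth via the pinhole relation w * z = constant.
def depth_from_forward_motion(width_before: float, width_after: float,
                              forward_motion_m: float) -> float:
    """Estimate the object's depth (meters) at the first observation.

    Uses w1 * z1 = w2 * z2 with z2 = z1 - forward_motion_m, which gives
    z1 = w2 * d / (w2 - w1).
    """
    if width_after <= width_before:
        raise ValueError("object should appear larger after moving toward it")
    return width_after * forward_motion_m / (width_after - width_before)

# Example: the mask widens from 80 to 100 pixels after moving 0.5 m forward,
# so the object was roughly 2.5 m away at the first observation.
print(depth_from_forward_motion(80, 100, 0.5))
```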
Speaker: Brent Griffin, PhD is the Perception Lead at Agility Robotics and was previously an assistant research scientist at the University of Michigan conducting research at the intersection of computer vision, control, and robot learning. He is lead author on publications in all of the top IEEE conferences for computer vision, robotics, and control, and his work has been featured in Popular Science, in IEEE Spectrum, and on the Big Ten Network.
Q&A
- Given the segmentation and antipodal grasping approach shown, how does the system handle dynamic environments where object positions or orientations may change rapidly?
- Do the segmentation model and the grasping algorithm adapt in real time, and what are the computational constraints associated with such adaptations?
- Is the error from segmentation used in the control loop? Or is it just used for planning?
- For the learned visual servoing, is depth used or is it purely RGB?
- Can you comment on how robust the robot-supervised learning of segmentation masks is in practice?
- What's the difference between the way data is collected for ODMD versus SfM?
- How does learning depth / 3D reconstruction with neural networks compare to traditional approaches like COLMAP?
- When you have the robot gripper "auto-label" objects, what label does it give to the hammer, for instance (since it does not know what a hammer is)?
Resource links
- University of Michigan code used for the third HSR Challenge (181213) on GitHub
- Paper: Video Object Segmentation-based Visual Servo Control and Object Depth Estimation on a Mobile Robot
- Paper: Robot-Supervised Learning for Object Segmentation
- Paper: Learning Object Depth from Camera Motion and Video Object Segmentation
Join the AI, Machine Learning and Data Science Meetup!
The combined membership of the Computer Vision and AI, Machine Learning and Data Science Meetups has grown to over 20,000 members! The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.
Join one of the 12 Meetup locations closest to your timezone.
- Athens
- Austin
- Bangalore
- Boston
- Chicago
- London
- New York
- Peninsula
- San Francisco
- Seattle
- Silicon Valley
- Toronto
What’s Next?
Up next, on May 30 from 5:30 PM to 8:00 PM Pacific, we have five great speakers lined up!
- Multi-Model Visual Question Answering (VQA) Using UForm Tiny Models with Milvus Vector Database - Christy Bergman, AI Developer Advocate at Zilliz, and Ash Vardanian, Founder at Unum Cloud
- Lessons Learned Fine-tuning Llama2 for Autonomous Agents - Rahul Parundekar - Founder at A.I. Hero, Inc.
- Combining Hugging Face Transformer Models and Image Data with FiftyOne - Jacob Marks, PhD - ML Engineer/Researcher at Voxel51
- Strategies for Enhancing the Adoption of Open Source Libraries: A Case Study on Albumentations.ai - Vladimir Iglovikov, PhD - Founder and CEO at Albumentations.AI
Register for the Zoom here. You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.
Get Involved!
There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:
- You’d like to speak at an upcoming Meetup
- You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
- You’d like to co-organize a Meetup
- You’d like to co-sponsor a Meetup
Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me over LinkedIn to discuss how to get you plugged in.
These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It's easy to get started in just a few minutes.
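If you want to see that for yourself, a minimal first session looks roughly like this, using the small "quickstart" dataset from the FiftyOne dataset zoo:

```python
# A quick first look at FiftyOne: load a small sample dataset from the
# dataset zoo and explore it in the App. Requires `pip install fiftyone`.
import fiftyone as fo
import fiftyone.zoo as foz

dataset = foz.load_zoo_dataset("quickstart")  # small sample dataset with predictions
session = fo.launch_app(dataset)  # opens the FiftyOne App in your browser
session.wait()  # keep the App open until you close it
```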