Jimmy Guerrero for Voxel51

Posted on • Originally published at voxel51.com

Recapping the AI, Machine Learning and Computer Vision Meetup — October 24, 2024

We just wrapped up the October ‘24 AI, Machine Learning and Computer Vision Meetup, and if you missed it or want to revisit it, here’s a recap! In this blog post you’ll find the playback recordings, highlights from the presentations and Q&A, and the upcoming Meetup schedule so that you can join us at a future event.

Accelerating Machine Learning Research and Development for Autonomy

At Oxa (Autonomous Vehicle Software), we designed an automated workflow for building machine vision models at scale, from data collection to in-vehicle deployment. The workflow involves several steps: intelligent route planning to maximise visual diversity; sampling of the sensor data with respect to visual and semantic uniqueness; language-driven automated annotation tools and a multi-modal search engine; and sensor data expansion using generative methods.
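The sampling step — keeping only frames that are visually and semantically distinct — is commonly implemented with embedding-based diversity selection. Below is a minimal, hypothetical sketch (not Oxa’s actual pipeline) of greedy farthest-point sampling over image embeddings, which repeatedly picks the frame farthest from everything already selected:

```python
import math

def farthest_point_sample(embeddings, k):
    """Greedily pick k embeddings that are maximally spread out.

    embeddings: list of equal-length float vectors (e.g. from an image model).
    Returns the indices of the selected, visually diverse samples.
    """
    def dist(a, b):
        return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

    selected = [0]  # seed with the first frame
    # Distance from every frame to its nearest already-selected frame
    min_dist = [dist(e, embeddings[0]) for e in embeddings]
    while len(selected) < k:
        nxt = max(range(len(embeddings)), key=lambda i: min_dist[i])
        selected.append(nxt)
        for i, e in enumerate(embeddings):
            min_dist[i] = min(min_dist[i], dist(e, embeddings[nxt]))
    return selected

# A cluster of near-duplicate frames plus two outliers: diversity
# sampling should pick the outliers over the near-duplicates.
embs = [[0.0, 0.0], [0.1, 0.0], [0.0, 0.1], [5.0, 5.0], [-4.0, 3.0]]
print(farthest_point_sample(embs, 3))  # → [0, 3, 4]
```

In practice the embeddings would come from a pretrained vision model, and the same idea extends to semantic uniqueness by sampling in a joint image–text embedding space.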

Speaker: Guillaume Rochette is a Staff Engineer at Oxa MetaDriver, a suite of tools that combines generative AI, digital twins and simulation to accelerate machine learning and testing of self-driving technology before and during real-world use. Prior to that, he completed a PhD in Machine Vision at the University of Surrey on “Pose Estimation and Novel View Synthesis of Humans”. He is currently working on Machine Vision and 3D Geometric Understanding for autonomous driving.

Q&A

  • What sort of hardware is running in the car?
  • Can you explain how the embeddings are done for these images?
  • What tools are you using for labeling?
  • Do you have any stats on the overall dataset you have collected that you can share? Percentage in cities, with traffic lights, etc.?
  • Do you consider human perception metrics for identifying objects?
  • How does the model behave when changing the trained model from urban data into more hybrid terrain, like farm fields?
  • What kind of homography techniques did you use for locating and detecting objects, especially in occlusion cases?
  • While collecting the data and training, are the models prone to data drift?
  • What challenges arise when expanding sensors or adding new data streams for OOD scenarios?

Resource Links

Pixels Are All You Need: Utilizing 2D Image Representations in Applied Robotics

Many vision-based robot control applications (like those in manufacturing) require 3D estimates of task-relevant objects, which can be realized by training a direct 3D object detection model. However, obtaining 3D annotation for a specific application is expensive relative to 2D object representations like segmentation masks or bounding boxes.

In this talk, Brent will describe how we achieve mobile robot manipulation using inexpensive pixel-based object representations combined with known 3D environmental constraints and robot kinematics. He will also discuss how recent Visual AI developments show promise to further reduce the cost of 2D training data, thereby increasing the practicality of pixel-based object representations in robotics.
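One standard way to combine a 2D detection with a known 3D environmental constraint — and a reasonable mental model for the approach, though not necessarily the specific method from the talk — is to back-project the pixel through the pinhole camera model and intersect the resulting ray with a known plane, such as the ground or a table top. A hypothetical sketch:

```python
def pixel_to_ground(u, v, fx, fy, cx, cy, cam_height):
    """Intersect the ray through pixel (u, v) with a flat ground plane.

    Camera convention: x right, y down, z forward; the camera sits
    cam_height metres above the ground (the plane y = cam_height).
    (fx, fy, cx, cy) are pinhole intrinsics. Returns the (x, y, z)
    intersection point in the camera frame.
    """
    # Ray direction through the pixel in normalized camera coordinates
    dx = (u - cx) / fx
    dy = (v - cy) / fy
    if dy <= 0:
        raise ValueError("pixel at or above the horizon; ray never hits the ground")
    t = cam_height / dy  # scale so the ray's y-coordinate equals cam_height
    return (t * dx, cam_height, t)

# A pixel on the image centre column, below the principal point, seen by
# a camera 1.5 m above the ground with fx = fy = 500, cx = cy = 320:
x, y, z = pixel_to_ground(320, 470, 500, 500, 320, 320, 1.5)
print(x, y, z)  # → 0.0 1.5 5.0 (a point 5 m straight ahead on the ground)
```

The appeal is exactly the one the talk highlights: the model only has to predict a 2D pixel location, and the cheap geometric constraint supplies the third dimension.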

Speaker: Brent Griffin, PhD is a Principal Machine Learning Scientist at Voxel51. Previously, he was the Perception Lead at Agility Robotics and an assistant research scientist at the University of Michigan conducting research at the intersection of computer vision, control, and robot learning. He is lead author on publications in all of the top IEEE conferences for computer vision, robotics, and control, and his work has been featured in Popular Science, in IEEE Spectrum, and on the Big Ten Network.

Q&A

  • If a 4th channel is lidar, it’s more sparse than the RGB resolution usually. Have you seen this in use?
  • Are there data or modeling considerations you can mention about handling this "sparse 4th channel" case?
  • When building the model and feeding in more information, is there a general concern about overfitting and straying away from generalization? With imaging, I used to believe that more data is better for training the model, so would “more information” refer to more data in this context?
  • Would it be correct to see the X in RGB-X model as different channels in a CNN model?

Resource Links

PostgreSQL for Innovative Vector Search

There are a plethora of datastores that can work with vector embeddings. You are probably already running one that allows for innovative uses of data alongside your embeddings – PostgreSQL! This talk will focus on showing examples of how features already present in the PostgreSQL ecosystem allow you to leverage it for cutting edge use cases. Live demos and lively discussion will be the focus of the talk. You will go home with the foundation to do more impressive vector similarity searches.
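The feature the talk builds on is the pgvector extension, which adds a `vector` column type and distance operators to PostgreSQL (`<->` for Euclidean distance, `<=>` for cosine distance, `<#>` for negative inner product). The sketch below shows the kind of SQL involved (against a hypothetical `images` table) and a pure-Python equivalent of the cosine-distance ranking that `ORDER BY embedding <=> query` performs:

```python
import math

# With pgvector installed, a similarity search might look like:
#   CREATE EXTENSION vector;
#   CREATE TABLE images (id serial PRIMARY KEY, embedding vector(3));
#   SELECT id FROM images ORDER BY embedding <=> '[1,0,0]' LIMIT 5;

def cosine_distance(a, b):
    """What pgvector's <=> operator computes: 1 - cosine similarity."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return 1.0 - dot / (na * nb)

query = [1.0, 0.0, 0.0]
rows = {1: [1.0, 0.0, 0.0], 2: [0.0, 1.0, 0.0], 3: [0.7, 0.7, 0.0]}

# Equivalent of: ORDER BY embedding <=> query
ranked = sorted(rows, key=lambda rid: cosine_distance(rows[rid], query))
print(ranked)  # → [1, 3, 2]: exact match first, orthogonal vector last
```

The point of the talk is that because this runs inside PostgreSQL, the `ORDER BY` can sit alongside ordinary `WHERE` clauses, joins, and the rest of the SQL you already use.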

Speaker: Steve Pousty is a dad, partner, son, a founder, and a principal developer advocate at Voxel51. He can teach you about Computer Vision, Data Analysis, Java, Python, PostgreSQL, Microservices, and Kubernetes. He has deep expertise in GIS/Spatial, Remote Sensing, Statistics, and Ecology. Steve has a Ph.D. in Ecology and can be bribed with offers of bird watching or fly fishing.

Resource Links

Join the AI, Machine Learning and Computer Vision Meetup!

The goal of the Meetups is to bring together communities of data scientists, machine learning engineers, and open source enthusiasts who want to share and expand their knowledge of AI and complementary technologies.

Join one of the 12 Meetup locations closest to your timezone.

What’s Next?


Up next on Nov 14, 2024 at 10:00 AM PT / 1:00 PM ET, we have three great speakers lined up!

  • Human-in-the-loop: Practical Lessons for Building Comprehensive AI Systems – Adrian Loy, Merantix Momentum

  • Curating Excellence: Strategies for Optimizing Visual AI Datasets – Harpreet Sahota, Voxel51

  • Deploying ML models on Edge Devices using Qualcomm AI Hub – Bhushan Sonawane, Qualcomm

Register for the Zoom here. You can find a complete schedule of upcoming Meetups on the Voxel51 Events page.

Get Involved!

There are a lot of ways to get involved in the Computer Vision Meetups. Reach out if you identify with any of these:

  • You’d like to speak at an upcoming Meetup
  • You have a physical meeting space in one of the Meetup locations and would like to make it available for a Meetup
  • You’d like to co-organize a Meetup
  • You’d like to co-sponsor a Meetup

Reach out to Meetup co-organizer Jimmy Guerrero on Meetup.com or ping me over LinkedIn to discuss how to get you plugged in.

These Meetups are sponsored by Voxel51, the company behind the open source FiftyOne computer vision toolset. FiftyOne enables data science teams to improve the performance of their computer vision models by helping them curate high quality datasets, evaluate models, find mistakes, visualize embeddings, and get to production faster. It’s easy to get started in just a few minutes.
