Paperium

Posted on • Originally published at paperium.net

Towards Good Practices for Very Deep Two-Stream ConvNets

Make Deep Video Networks Work: Simple Steps to Better Action Recognition

Image recognition improved rapidly with very deep neural networks, but video understanding kept lagging behind.
The researchers identify two main reasons: most video models were much shallower than their image counterparts, and labeled video data is scarce, so it is easy to overfit.
They made the two-stream networks much deeper and applied a few careful training practices so the models generalize instead of memorizing.
First, they pre-train both the spatial (image) stream and the temporal (motion) stream, so the networks start from a strong initialization.
Then they train with smaller learning rates, use more varied data augmentation, and apply stronger regularization so the networks ignore noise and learn real patterns.
They also parallelize training across multiple GPUs to speed things up and fit bigger models.
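As a minimal sketch of one of these practices, here is a step-decay learning-rate schedule of the kind such training recipes rely on: start with a small rate and shrink it at fixed intervals. The specific base rate, decay interval, and decay factor below are illustrative choices, not the paper's exact values.

```python
def step_decay_lr(base_lr: float, step: int, decay_every: int, gamma: float = 0.1) -> float:
    """Return the learning rate after `step` training iterations.

    The rate starts at `base_lr` and is multiplied by `gamma`
    every `decay_every` iterations (classic step decay).
    """
    return base_lr * (gamma ** (step // decay_every))

# Illustrative schedule: decay the rate by 10x every 10,000 steps.
for step in (0, 10_000, 20_000):
    print(step, step_decay_lr(base_lr=1e-3, step=step, decay_every=10_000))
```

Starting from a pre-trained network, a conservative schedule like this lets fine-tuning adapt the weights without wiping out what the model already learned.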
The result is clear: deeper video models learn actions much better when trained this way, and accuracy jumps noticeably.
If you care about video or machine learning, this shows that with the right care, machines can get much better at understanding motion, even from small datasets.

Read the comprehensive review at Paperium.net:
Towards Good Practices for Very Deep Two-Stream ConvNets

🤖 This analysis and review was primarily generated and structured by an AI. The content is provided for informational and quick-review purposes.
