Abstract
In the field of machine learning, it is common practice to use benchmark datasets to demonstrate the effectiveness of a method. The domain of action recognition in videos often uses datasets such as Kinetics, Something-Something, UCF-101 and HMDB-51 to report results. Considering the properties of these datasets, none focuses solely on very short clips (2 to 3 seconds) and on highly similar, fine-grained actions within one specific domain. This paper investigates how current state-of-the-art action recognition methods perform on a dataset consisting of highly similar, fine-grained actions. To this end, a dataset of skateboarding tricks was created. The analysis highlights both the benefits and the limitations of state-of-the-art methods and proposes future research directions in the activity recognition domain. The results show that the best performance is obtained by fusing RGB data with OpenPose data in the Temporal Shift Module.
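The abstract reports that fusing RGB data with OpenPose data yields the best results, but does not specify the fusion scheme. A common and simple choice for combining two recognition streams is late (score-level) fusion, sketched below; all function names, weights, and class labels here are illustrative assumptions, not details taken from the paper.

```python
import math

def softmax(logits):
    """Convert raw per-class logits to probabilities."""
    m = max(logits)
    exps = [math.exp(x - m) for x in logits]
    s = sum(exps)
    return [e / s for e in exps]

def late_fusion(rgb_logits, pose_logits, w_rgb=0.5):
    """Hypothetical late fusion: weighted average of the per-class
    probabilities produced by an RGB stream and a pose (OpenPose) stream."""
    p_rgb = softmax(rgb_logits)
    p_pose = softmax(pose_logits)
    return [w_rgb * a + (1 - w_rgb) * b for a, b in zip(p_rgb, p_pose)]

# Illustrative example with three trick classes (e.g. ollie, kickflip, heelflip)
rgb = [2.0, 0.5, 0.1]    # RGB-stream logits (made-up values)
pose = [1.5, 1.8, 0.2]   # OpenPose-stream logits (made-up values)
fused = late_fusion(rgb, pose)
predicted = max(range(len(fused)), key=fused.__getitem__)
```

An equal-weight average is only one option; in practice the stream weight `w_rgb` would be tuned on a validation split.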
Original language | English |
---|---|
Publication status | Published - 25 Aug 2021 |
Event | 15. International Conference on Computer Vision and Image Processing - Paris, World Academy of Science, Engineering and Technology - Paris, France. Duration: 30 Dec 2021 → 31 Dec 2021. https://app.qwoted.com/opportunities/event-iccvip-2021-15-international-conference-on-computer-vision-and-image-processing-paris |
Conference
Conference | 15. International Conference on Computer Vision and Image Processing - Paris, World Academy of Science, Engineering and Technology |
---|---|
Abbreviated title | ICCVIP 2021 |
Country/Territory | France |
City | Paris |
Period | 30/12/21 → 31/12/21 |
Internet address | https://app.qwoted.com/opportunities/event-iccvip-2021-15-international-conference-on-computer-vision-and-image-processing-paris |