WVU MULTI-VIEW ACTION RECOGNITION DATASET
As part of our research on real-time multi-view human action recognition in a camera network, we collected data of subjects performing several actions from different views using a network of 8 embedded cameras. This data may be useful for related research on activity recognition, so we have made it available for download. There are two different datasets.
Dataset 1:
This dataset was used to evaluate recognition of unit actions: each sample consists of a subject performing only one action, the start and end times of each action are known, and the input provided spans exactly the duration of the action. The subject performs a set of 12 actions at approximately the same pace. The data was collected at a rate of 20 fps with 640 x 480 resolution. To download this dataset, click here.
Dataset 2:
This dataset was used for evaluating interleaved sequences of actions. Each sequence consists of multiple unit actions, and each unit action may be of varying duration. The data was collected at a rate of 20 fps with 960 x 720 resolution. To download this dataset, click here.
The following is a description of how the data was acquired.
Description:
The multi-camera network system consists of 8 cameras that provide completely overlapping coverage of a rectangular region R (about 50 x 50 feet) from different viewing directions. The relative orientations of the cameras are known. The system is shown in the following figure.
The subject can be at any location within this region such that each camera is able to capture the complete image of the subject. We define the view-angle of a camera with respect to an action being performed as the angle made by the optical axis of the camera with the direction along which the subject performs the action sequence. The view-angle is measured in the clockwise direction from the ray originating at the subject location that is parallel to the optical axis of the camera, as shown in the following figure.
We divide the view-angle range of $0^\circ$ to $360^\circ$ into 8 different sets by considering that different instances of the same action captured with small view-angle separations are likely to appear similar. The 8 view-angle sets are denoted as $V_j$ ($1 \le j \le 8$) and are illustrated in the following figure for camera C. For example, when the subject is facing the region between rays ZA and ZB, the camera C provides view V2.
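The mapping from a view-angle to one of the 8 sets can be sketched in a few lines. This is only an illustration under stated assumptions: the exact set boundaries are defined by the figure, which is not reproduced here, so the sketch assumes 8 equal sets of 45 degrees each, numbered clockwise, with V1 centered on a reference angle. The function name view_set_index and the parameter v1_center_deg are hypothetical and not part of the dataset.

```python
def view_set_index(view_angle_deg, v1_center_deg=0.0, num_sets=8):
    """Return j (1..num_sets) such that the angle falls in set V_j.

    Assumptions (not taken from the dataset documentation):
      - the sets have equal width (45 degrees for 8 sets),
      - they are numbered in the direction of increasing view-angle,
      - V1 is centered on v1_center_deg.
    """
    width = 360.0 / num_sets
    # Shift so that V1 covers [0, width), then wrap the angle into [0, 360).
    shifted = (view_angle_deg - v1_center_deg + width / 2.0) % 360.0
    # The extra modulo guards against a wrapped value of exactly 360.0.
    return int(shifted // width) % num_sets + 1


if __name__ == "__main__":
    for angle in (0.0, 45.0, 180.0, -10.0):
        print(angle, "-> V%d" % view_set_index(angle))
```

Under these assumptions, angles that differ by 180 degrees land four sets apart, so V1 and V5 correspond to opposite facing directions.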
Download Dataset 1:
The subject performs a set of 12 actions at approximately the same pace. The data was collected at a rate of 20 fps with 640 x 480 resolution.
The following is the list of actions. [Note that our ICDSC paper reports results on 10 of these actions. Nodding and Throwing are not included.]
Action/Event # | Action Description
Event 1 | Standing Still
Event 2 | Nodding head
Event 3 | Clapping
Event 4 | Waving 1 hand
Event 5 | Waving 2 hands
Event 6 | Punching
Event 7 | Jogging
Event 8 | Jumping Jack
Event 9 | Kicking
Event 10 | Picking
Event 11 | Throwing
Event 12 | Bowling
Each event has 65 video samples (provided as part 1, part 2 and part 3), and each sample consists of an approximately 3-second video (with 71 frames in each video) from 8 different views labeled V1 to V8 as per the definitions provided above. For instance, V1 represents facing the camera and V5 represents facing away from the camera. In the part 1 and part 2 videos, the subject is always at the center of the room. In samples from part 3, different test subjects are at different locations in the room.
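As a rough size check, the figures above can be combined in a few lines. This is a sketch, not an official count: it assumes that every sample has one video for each of the 8 views and that every video has exactly 71 frames. The variable names are illustrative only.

```python
# Figures quoted in the dataset description.
FPS = 20                # capture rate, frames per second
FRAMES_PER_VIDEO = 71   # frames in each video
NUM_EVENTS = 12         # actions in Dataset 1
SAMPLES_PER_EVENT = 65  # samples per event, across parts 1 to 3
NUM_VIEWS = 8           # views V1 to V8

# Duration of one video implied by the frame count and frame rate.
clip_seconds = FRAMES_PER_VIDEO / FPS

# Assumption: each sample contributes one video per view.
total_videos = NUM_EVENTS * SAMPLES_PER_EVENT * NUM_VIEWS
total_frames = total_videos * FRAMES_PER_VIDEO

print(clip_seconds)   # 3.55
print(total_videos)   # 6240
print(total_frames)   # 443040
```

At 20 fps, 71 frames correspond to 3.55 seconds, which is consistent with the "approximately 3 second" description.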
The dataset can be downloaded from the following links.
1. Event 1 (Standing Still): This action consists of a subject standing still in a controlled environment.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
2. Event 2 (nodding head): This action consists of a subject nodding his/her head.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
3. Event 3 (clapping): This action consists of a subject clapping.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
4. Event 4 (waving 1 hand): This action consists of a subject waving one hand.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
5. Event 5 (waving 2 hands): This action consists of a subject waving both of his/her hands.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
6. Event 6 (punch): This action consists of a subject punching with one or both hands.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
7. Event 7 (jogging): This action consists of a subject jogging.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
8. Event 8 (jumping jack): This action consists of a subject performing jumping jacks.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
9. Event 9 (kicking): This action consists of a subject kicking with his/her leg.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
10. Event 10 (pick up): This action consists of a subject picking up something from the ground.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
11. Event 11 (throwing): This action consists of a subject performing a throwing action.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
12. Event 12 (bowling): This action consists of a subject performing a bowling action.
You can download this action set by clicking here for Part1, Part2, Part3.
Sample Image
Background videos for each of the 8 views can be found here.
Download Dataset 2:
An activity, or action sequence, performed by a subject consists of an interleaved sequence of unit actions chosen from the following set of 9 actions, performed in any (unknown) orientation with respect to the cameras: clapping hands, waving one arm (left or right), waving two arms, punching, jogging in place, jumping in place, kicking, bending and underarm bowling. Each action may be of different duration. Between two consecutive actions from this set, there may or may not be a pause in which the subject does nothing (i.e., simply stands still) or performs random movements that do not belong to the set.
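One way to picture such an interleaved sequence is as an ordered list of labeled frame ranges. The sketch below is purely illustrative: the Segment class, its field names, and the example frame numbers are all hypothetical and made up for this example. The actual annotation format is whatever the dataset's Readme and metadata define, which is not reproduced here. Only the 20 fps frame rate and the action names come from the description above.

```python
from dataclasses import dataclass
from typing import List, Optional

FPS = 20  # frame rate stated for Dataset 2


@dataclass
class Segment:
    """One stretch of an action sequence (hypothetical representation)."""
    label: Optional[str]  # None marks a pause or a movement outside the set
    start_frame: int      # inclusive
    end_frame: int        # exclusive

    def duration_seconds(self, fps: int = FPS) -> float:
        return (self.end_frame - self.start_frame) / fps


def action_labels(sequence: List[Segment]) -> List[str]:
    """Return the unit-action labels in order, skipping pauses."""
    return [s.label for s in sequence if s.label is not None]


# Made-up example: two unit actions of different duration with a pause between.
sequence = [
    Segment("clapping hands", 0, 60),
    Segment(None, 60, 90),
    Segment("kicking", 90, 130),
]

if __name__ == "__main__":
    print(action_labels(sequence))
    print([s.duration_seconds() for s in sequence])
```

Keeping pauses as explicit segments with no label makes it easy to tell a gap between actions apart from missing annotation.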
The data has been sorted based on the 8 views. For each view, action sequences performed by different subjects are provided. Data from all cameras is synchronized in time.
Data from: View 1, View 2, View 3, View 4, View 5, View 6, View 7, View 8
A Readme file for the dataset, along with metadata for interpreting the dataset, is here.
If you have any questions or comments regarding the dataset, please contact: Dr. Vinod Kulathumani