WVU MULTI-VIEW ACTION RECOGNITION DATASET

As part of our research on real-time multi-view human action recognition in a camera network, we collected data of subjects performing several actions from different views using a network of 8 embedded cameras. This data could be potentially useful for related research on activity recognition. Hence we have made the data available for download. There are two different datasets.

Dataset 1: This dataset was used to evaluate recognition of unit actions: each sample consists of a subject performing only one action, the start and end times of each action are known, and the length of the input provided equals the duration of the action. The subject performs a set of 12 actions at approximately the same pace. The data was collected at a rate of 20 fps with 640 x 480 resolution. To download this dataset click here.

Dataset 2: This dataset was used for evaluating interleaved sequences of actions. Each sequence consists of multiple unit actions, and each unit action may be of varying duration. The data was collected at a rate of 20 fps with 960 x 720 resolution. To download this dataset click here.

The following is a description of how the data was acquired.

Description:

The multi-camera network system consists of 8 cameras that provide completely overlapping coverage of a rectangular region R (about 50 x 50 feet) from different viewing directions. The relative orientations of the cameras are known. The system is shown in the following figure.

The subject can be at any location within this region such that each camera is able to capture the complete image of the subject. We define the view-angle of a camera with respect to an action being performed as the angle made by the optical axis of the camera with the direction along which the subject performs the action sequence. The view-angle is measured in the clockwise direction from the ray that originates at the subject location and is parallel to the optical axis of the camera, as shown in the following figure.

We divide the view-angle range of $0^\circ$ to $360^\circ$ into 8 different sets, on the basis that different instances of the same action captured with small view-angle separations are likely to appear similar. The 8 view-angle sets are denoted $V_j$ ($1 \le j \le 8$) and are illustrated in the following figure for camera C. For example, when the subject is facing the region between rays ZA and ZB, the camera C provides view V₂.
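The partition of the view-angle range into eight sets can be sketched in code. This is a minimal sketch, assuming the eight sets are equal 45° sectors numbered in the clockwise direction; the exact sector boundaries are given by the figure, so the function name `view_set` and the `offset_deg` parameter (the angle at which V₁ is assumed to start) are hypothetical and introduced only for illustration.

```python
def view_set(angle_deg: float, offset_deg: float = 0.0) -> int:
    """Return j in 1..8 such that angle_deg falls in view-angle set V_j.

    Assumptions (not taken from the dataset documentation):
    - the 8 sets are equal sectors of 45 degrees each;
    - V_1 starts at offset_deg and indices increase clockwise.
    """
    num_sets = 8
    sector_width = 360.0 / num_sets
    # Wrap the angle into [0, 360) relative to the assumed start of V_1.
    normalized = (angle_deg - offset_deg) % 360.0
    # The extra modulo guards against floating-point wrap-around at 360.
    return int(normalized // sector_width) % num_sets + 1


if __name__ == "__main__":
    for angle in (0, 44.9, 45, 180, 359.9):
        print(angle, "->", "V%d" % view_set(angle))
```

With `offset_deg=-22.5`, each sector is centred on a multiple of 45°, which may be the more natural choice if V₁ is meant to cover views close to a single reference direction.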

Download Dataset 1:

The subject performs a set of 12 actions at approximately the same pace. The data was collected at a rate of 20 fps with 640 x 480 resolution. The following is the list of actions. [Note that our ICDSC paper reports results on 10 of these actions; Nodding and Throwing are not included.]

Action/Event #	Action Description
Event 1	Standing Still
Event 2	Nodding head
Event 3	Clapping
Event 4	Waving 1 hand
Event 5	Waving 2 hands
Event 6	Punching
Event 7	Jogging
Event 8	Jumping Jack
Event 9	Kicking
Event 10	Picking
Event 11	Throwing
Event 12	Bowling
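For scripts that process the data, the event table above can be captured as a simple lookup. The following is a minimal sketch; the names `EVENTS`, `ICDSC_EXCLUDED` and `ICDSC_EVENTS` are hypothetical and are not part of the dataset or its Readme.

```python
# Event number -> action description, copied from the table above.
EVENTS = {
    1: "Standing Still",
    2: "Nodding head",
    3: "Clapping",
    4: "Waving 1 hand",
    5: "Waving 2 hands",
    6: "Punching",
    7: "Jogging",
    8: "Jumping Jack",
    9: "Kicking",
    10: "Picking",
    11: "Throwing",
    12: "Bowling",
}

# Nodding (Event 2) and Throwing (Event 11) are not included in the
# results reported in the ICDSC paper.
ICDSC_EXCLUDED = {2, 11}

# The 10 events that remain after removing the excluded ones.
ICDSC_EVENTS = {
    number: name
    for number, name in EVENTS.items()
    if number not in ICDSC_EXCLUDED
}

if __name__ == "__main__":
    for number, name in sorted(ICDSC_EVENTS.items()):
        print("Event %d: %s" % (number, name))
```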

Each event has 65 video samples (provided as part 1, part 2 and part 3), and each sample consists of a video of approximately 3 seconds (71 frames per video) captured from 8 different views, labeled V₁ to V₈ as per the definitions provided above. For instance, V₁ represents the subject facing the camera and V₅ represents the subject facing away from the camera. In the part 1 and part 2 videos, the subject is always at the center of the room. In the part 3 samples, different test subjects are at different locations in the room.
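The figures quoted above can be combined into a quick size estimate. This is a rough sketch, assuming every sample contains exactly one 71-frame video for each of the 8 views; actual counts may differ slightly, since the clip length is only stated approximately. All constant and function names are hypothetical.

```python
FPS = 20                # frame rate stated for Dataset 1
FRAMES_PER_VIDEO = 71   # frames per video, as stated above
NUM_EVENTS = 12         # actions in Dataset 1
SAMPLES_PER_EVENT = 65  # samples per event (parts 1 to 3 combined)
NUM_VIEWS = 8           # views V1..V8 per sample


def clip_duration_seconds(frames: int = FRAMES_PER_VIDEO, fps: int = FPS) -> float:
    """Duration of one clip in seconds: frame count divided by frame rate."""
    return frames / fps


def total_videos() -> int:
    """Single-view videos in Dataset 1 under the assumptions stated above."""
    return NUM_EVENTS * SAMPLES_PER_EVENT * NUM_VIEWS


def total_frames() -> int:
    """Total frames in Dataset 1 under the same assumptions."""
    return total_videos() * FRAMES_PER_VIDEO


if __name__ == "__main__":
    print("clip duration (s):", clip_duration_seconds())
    print("single-view videos:", total_videos())
    print("frames:", total_frames())
```

At 20 fps, 71 frames correspond to about 3.55 seconds, which is consistent with the "approximately 3 seconds" quoted for each sample.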

The dataset can be downloaded from the following links.

1. Event 1 (Standing Still): This action consists of a subject standing still in a controlled environment.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

2. Event 2 (Nodding head): This action consists of a subject nodding his/her head.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

3. Event 3 (Clapping): This action consists of a subject clapping.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

4. Event 4 (Waving 1 hand): This action consists of a subject waving one hand.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

5. Event 5 (Waving 2 hands): This action consists of a subject waving both of his/her hands.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

6. Event 6 (Punch): This action consists of a subject punching with one or both hands.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

7. Event 7 (Jogging): This action consists of a subject jogging.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

8. Event 8 (Jumping jack): This action consists of a subject performing jumping jacks.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

9. Event 9 (Kicking): This action consists of a subject kicking with his/her leg.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

10. Event 10 (Pick up): This action consists of a subject picking up something from the ground.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

11. Event 11 (Throwing): This action consists of a subject performing a throwing action.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

12. Event 12 (Bowling): This action consists of a subject performing a bowling action.
You can download this action set by clicking here for
Part1.
Part2.
Part3.

Sample Image

Background Videos for each of the 8 views can be found here.

Download Dataset 2:

An activity, or action sequence, performed by a subject consists of an interleaved sequence of unit actions chosen from the following set of 9 actions, performed in any (unknown) orientation with respect to the cameras: clapping hands, waving one arm (left or right), waving two arms, punching, jogging in place, jumping in place, kicking, bending and underarm bowling. Each action may be of a different duration. Between two consecutive actions from this set, there may or may not be a pause in which the subject does nothing (i.e., simply stands still) or performs random movements that do not belong to the set.
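One way to represent such an interleaved sequence in code is as a list of labeled frame ranges, with unlabeled gaps for pauses and out-of-set movements. The following is only an illustrative sketch: the actual annotation format is described in the Readme and metadata and is not reproduced here, so the `Segment` class, the `per_frame_labels` function and the example frame ranges are all hypothetical.

```python
from dataclasses import dataclass
from typing import List, Optional

FPS = 20  # frame rate stated for Dataset 2

# The 9 unit actions listed above.
ACTIONS = (
    "clapping hands", "waving one arm", "waving two arms", "punching",
    "jogging in place", "jumping in place", "kicking", "bending",
    "underarm bowling",
)


@dataclass(frozen=True)
class Segment:
    """A hypothetical annotation: one labeled span of frames."""
    label: Optional[str]  # None marks a pause or an out-of-set movement
    start_frame: int      # inclusive
    end_frame: int        # exclusive

    def duration_seconds(self) -> float:
        return (self.end_frame - self.start_frame) / FPS


def per_frame_labels(segments: List[Segment], num_frames: int) -> List[Optional[str]]:
    """Expand segments into one label per frame; uncovered frames stay None."""
    labels: List[Optional[str]] = [None] * num_frames
    for seg in segments:
        if seg.label is not None and seg.label not in ACTIONS:
            raise ValueError("unknown action label: %r" % seg.label)
        # Clip the segment to the valid frame range before writing labels.
        for frame in range(max(seg.start_frame, 0), min(seg.end_frame, num_frames)):
            labels[frame] = seg.label
    return labels


if __name__ == "__main__":
    # Made-up example: clapping, a one-second pause, then kicking.
    example = [
        Segment("clapping hands", 0, 40),
        Segment(None, 40, 60),
        Segment("kicking", 60, 100),
    ]
    labels = per_frame_labels(example, 100)
    print(labels[0], labels[50], labels[99])
```

Keeping the pause as an explicit `None` segment makes it possible to distinguish "no action from the set" from missing annotation, if that distinction matters for evaluation.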

The data has been sorted based on the 8 views. For each view, action sequences performed by different subjects are provided. Data from all cameras is synchronized in time.

Data from: View 1, View 2, View 3, View 4, View 5, View 6, View 7, View 8

A Readme file for the dataset, along with metadata for interpreting it, is here.

If you have any questions or comments regarding the dataset, please contact: Dr. Vinod Kulathumani