The following tables report the performance measures (explained on the overview page) for both the supplied baseline systems and the systems submitted by the participants. The performance measures are: the mean F-measure for the samples (MF-samples); the mean F-measure for the concepts (MF-concepts); and the mean average precision for the samples (MAP-samples).

In the tables, the values between the square brackets correspond to the 95% confidence intervals computed using Wilson's method.
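
For readers unfamiliar with Wilson's method, the following is a minimal sketch of the textbook Wilson score interval for a proportion. This page does not say how the method was applied to the mean F-measure and mean average precision values, so the sketch illustrates the general formula only and is not expected to reproduce the intervals in the tables. The function name `wilson_interval` and its parameters are illustrative assumptions.

```python
import math
from typing import Tuple


def wilson_interval(p_hat: float, n: int, z: float = 1.96) -> Tuple[float, float]:
    """Wilson score interval for a proportion p_hat observed over n trials.

    z = 1.96 corresponds approximately to a 95% confidence level.
    This is the standard textbook formula, not necessarily the exact
    procedure used to produce the tables on this page.
    """
    if n <= 0:
        raise ValueError("n must be positive")
    if not 0.0 <= p_hat <= 1.0:
        raise ValueError("p_hat must lie in [0, 1]")
    z2 = z * z
    denom = 1.0 + z2 / n
    # The interval is centred on a value pulled from p_hat towards 0.5.
    center = (p_hat + z2 / (2.0 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1.0 - p_hat) / n + z2 / (4.0 * n * n)) / denom
    return center - half_width, center + half_width


# Illustrative values only: a proportion of 0.5 observed over 100 trials.
low, high = wilson_interval(0.5, 100)
print(f"{100 * low:.1f}--{100 * high:.1f}")
```

Unlike the simple normal approximation, the Wilson interval is not symmetric around the observed value when that value is far from 0.5, which is consistent with the asymmetric intervals seen for some entries in the tables.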

At the bottom of the page you can also download files containing more complete raw results computed for the baseline systems and the participants' submissions.

Test set baseline results

Baseline   MF-samples (%)   MF-concepts (%)   MAP-samples (%)
baseline_oppsift 16.7   [16.4--17.1] 9.8   [8.6--12.4] 20.2   [19.8--20.6]
baseline_rgbsift 16.6   [16.3--16.9] 9.7   [8.6--12.4] 20.0   [19.6--20.4]
baseline_csift 16.7   [16.4--17.1] 8.5   [7.5--11.0] 20.4   [20.0--20.8]
baseline_sift 16.5   [16.2--16.9] 9.2   [8.1--11.8] 19.9   [19.5--20.3]
baseline_colorhist 15.7   [15.4--16.1] 7.0   [6.2--9.5] 19.2   [18.8--19.6]
baseline_gist2 15.0   [14.7--15.4] 5.7   [5.0--8.0] 17.8   [17.5--18.2]
baseline_getlf 14.9   [14.6--15.2] 5.3   [4.6--7.7] 18.2   [17.8--18.6]
baseline_rand 3.5   [3.3--3.7] 2.6   [2.5--4.5] 8.8   [8.6--9.0]

Test set participants' results

Group and Run #   MF-samples (%)   MF-concepts (%)   MAP-samples (%)
KDEVIR_09 37.7   [37.0--38.5] 54.7   [50.9--58.3] 36.8   [36.1--37.5]
KDEVIR_08 37.5   [36.7--38.3] 54.8   [51.0--58.3] 36.5   [35.8--37.2]
MIL_03 27.5   [27.0--28.0] 34.7   [32.5--37.4] 36.9   [36.4--37.5]
KDEVIR_03 34.6   [34.1--35.2] 25.9   [24.3--28.3] 35.0   [34.4--35.6]
KDEVIR_04 34.0   [33.5--34.6] 25.7   [24.2--28.1] 34.8   [34.2--35.4]
KDEVIR_10 34.2   [33.7--34.7] 25.1   [23.6--27.5] 35.2   [34.6--35.8]
MIL_02 26.5   [26.0--27.0] 32.3   [30.3--34.9] 35.8   [35.2--36.3]
MindLab_01 25.8   [25.2--26.3] 30.7   [28.2--34.0] 37.0   [36.4--37.6]
KDEVIR_06 33.8   [33.3--34.3] 24.4   [22.9--26.8] 34.5   [33.9--35.1]
MindLab_02 24.8   [24.2--25.3] 31.7   [29.2--34.8] 37.0   [36.4--37.6]
KDEVIR_05 33.6   [33.1--34.1] 24.1   [22.5--26.6] 33.2   [32.6--33.8]
KDEVIR_02 32.8   [32.3--33.3] 22.9   [21.5--25.3] 33.0   [32.4--33.6]
MLIA_09 24.8   [24.3--25.4] 33.2   [30.7--36.4] 27.8   [27.3--28.4]
DISA-MU_04 29.7   [29.2--30.3] 19.1   [17.5--21.8] 34.3   [33.8--35.0]
MLIA_10 24.8   [24.3--25.4] 33.2   [30.7--36.4] 27.9   [27.3--28.4]
RUC_05 31.1   [30.7--31.6] 25.0   [22.9--28.0] 27.5   [27.0--28.1]
RUC_07 29.3   [28.9--29.8] 25.3   [23.4--28.1] 27.5   [27.0--28.1]
DISA-MU_05 28.4   [27.9--29.0] 20.3   [18.8--23.0] 32.3   [31.7--32.9]
DISA-MU_03 28.5   [28.0--29.1] 18.9   [17.4--21.6] 32.9   [32.3--33.5]
RUC_02 27.8   [27.4--28.3] 24.1   [22.3--26.8] 30.2   [29.7--30.8]
RUC_06 29.0   [28.5--29.4] 25.2   [23.3--28.0] 27.5   [27.0--28.1]
MLIA_08 24.6   [24.1--25.2] 33.3   [30.7--36.4] 27.4   [26.9--28.0]
RUC_01 28.0   [27.5--28.4] 24.1   [22.3--26.8] 30.2   [29.7--30.8]
RUC_03 27.9   [27.5--28.4] 24.0   [22.3--26.8] 30.2   [29.7--30.8]
MIL_01 24.0   [23.6--24.6] 30.1   [28.2--32.7] 31.9   [31.3--32.5]
MLIA_03 24.5   [24.0--25.0] 32.2   [29.7--35.4] 27.6   [27.0--28.2]
MLIA_07 24.4   [23.9--25.0] 33.5   [30.9--36.7] 26.9   [26.3--27.4]
MLIA_04 24.5   [24.0--25.0] 32.2   [29.7--35.4] 27.6   [27.1--28.2]
MLIA_06 24.1   [23.5--24.6] 33.6   [30.9--36.9] 26.3   [25.8--26.9]
DISA-MU_01 27.9   [27.4--28.5] 15.4   [14.0--18.1] 31.6   [31.0--32.2]
MLIA_02 24.4   [23.8--24.9] 32.4   [29.9--35.6] 27.2   [26.7--27.8]
MLIA_01 24.2   [23.6--24.7] 32.7   [30.1--35.9] 26.8   [26.2--27.4]
DISA-MU_02 27.5   [27.0--28.1] 15.3   [14.0--18.0] 31.9   [31.3--32.5]
KDEVIR_07 22.9   [22.3--23.5] 48.7   [44.6--52.8] 23.9   [23.4--24.4]
MLIA_05 23.9   [23.3--24.4] 32.8   [30.2--36.1] 26.1   [25.6--26.7]
RUC_04 21.9   [21.5--22.3] 21.9   [20.3--24.5] 30.2   [29.7--30.8]
RUC_08 20.6   [20.2--21.1] 21.5   [20.3--23.8] 27.5   [27.0--28.1]
IPL_09 18.4   [18.1--18.8] 15.8   [14.2--18.7] 23.4   [22.9--23.9]
IPL_08 18.4   [18.0--18.7] 15.7   [14.1--18.6] 23.4   [22.9--23.9]
IPL_10 18.3   [18.0--18.7] 15.5   [14.0--18.3] 23.4   [22.9--23.9]
IPL_04 18.9   [18.5--19.3] 13.3   [12.0--15.9] 22.5   [22.0--23.0]
IPL_03 18.7   [18.3--19.1] 13.3   [12.0--15.9] 22.4   [21.9--22.9]
IPL_05 18.8   [18.4--19.2] 13.0   [11.7--15.6] 22.4   [22.0--22.9]
IPL_02 18.6   [18.3--19.0] 12.4   [11.2--15.1] 22.1   [21.6--22.6]
IPL_07 17.7   [17.4--18.1] 13.4   [12.1--16.1] 22.0   [21.5--22.5]
IMC-FU_01 16.3   [15.9--16.7] 12.5   [11.4--15.0] 25.1   [24.6--25.7]
IMC-FU_02 16.3   [15.9--16.7] 12.5   [11.4--15.0] 25.1   [24.6--25.7]
IPL_01 18.5   [18.2--18.9] 12.1   [10.8--14.8] 21.9   [21.4--22.4]
IPL_06 17.3   [17.0--17.7] 12.0   [10.8--14.7] 21.3   [20.9--21.8]
INAOE_05 5.3   [5.1--5.6] 10.3   [8.9--13.1] 9.6   [9.4--9.8]
INAOE_02 5.9   [5.7--6.1] 9.2   [8.2--11.7] 9.6   [9.4--9.8]
INAOE_06 5.3   [5.1--5.6] 10.2   [9.0--13.0] 9.3   [9.1--9.5]
INAOE_03 6.2   [6.0--6.4] 9.1   [8.1--11.6] 9.3   [9.1--9.5]
INAOE_04 4.2   [4.0--4.5] 9.3   [8.3--11.9] 9.3   [9.1--9.6]
NII_01 13.0   [12.7--13.3] 2.3   [1.9--4.5] 14.7   [14.4--15.0]
INAOE_01 4.9   [4.7--5.2] 8.3   [7.5--10.6] 9.3   [9.2--9.6]
FINKI_01 7.2   [7.0--7.3] 4.7   [4.2--7.0] 6.9   [6.8--7.1]
KDEVIR_01 4.4   [4.2--4.7] 3.0   [2.9--4.9] 8.6   [8.4--8.8]

Raw results

iclef14annot_results_baseline.zip
iclef14annot_results_DISA-MU.zip
iclef14annot_results_FINKI.zip
iclef14annot_results_IMC-FU.zip
iclef14annot_results_INAOE.zip
iclef14annot_results_IPL.zip
iclef14annot_results_KDEVIR.zip
iclef14annot_results_MIL.zip
iclef14annot_results_MindLab.zip
iclef14annot_results_MLIA.zip
iclef14annot_results_NII.zip
iclef14annot_results_RUC.zip

Raw format description:

The results are text files in which each line corresponds to one type of performance measure. The first column is the run identifier, of the form {GROUP}_{RUN#}, and the second column indicates the type of performance measure. The measures of type {PREC,RECL,F,AP}{samp,cnpt}-te, i.e., all those that do not start with an 'm', give the precision, recall, F-measure or average precision for each sample or concept of the test set. The measures that start with 'm' are the means of the corresponding measures: the first value is the mean itself, and the following two values are the lower and upper limits of its 95% confidence interval.
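
The format described above could be read with something like the following sketch. It assumes that columns are separated by whitespace, which the description does not state. The function name `parse_results`, the run identifier `GROUP_01`, the measure name `mFsamp-te` and all numbers in the sample lines are hypothetical and serve only to illustrate the layout.

```python
from typing import Dict, Iterable, List, Tuple

Key = Tuple[str, str]  # (run identifier, measure type)


def parse_results(
    lines: Iterable[str],
) -> Tuple[Dict[Key, Tuple[float, float, float]], Dict[Key, List[float]]]:
    """Split result lines into mean measures and per-item measures.

    Assumption: columns are whitespace-separated.
    """
    means: Dict[Key, Tuple[float, float, float]] = {}
    per_item: Dict[Key, List[float]] = {}
    for line in lines:
        fields = line.split()
        if len(fields) < 3:
            continue  # skip blank or incomplete lines
        run_id, measure = fields[0], fields[1]
        values = [float(v) for v in fields[2:]]
        if measure.startswith("m"):
            if len(values) < 3:
                raise ValueError(f"expected mean, lower, upper in: {line!r}")
            # Mean measure: the mean, then the lower and upper 95% limits.
            means[(run_id, measure)] = (values[0], values[1], values[2])
        else:
            # Per-sample or per-concept measure: one value per item.
            per_item[(run_id, measure)] = values
    return means, per_item


# Hypothetical sample lines; the numbers are made up for illustration.
sample = [
    "GROUP_01 mFsamp-te 0.3 0.2 0.4",
    "GROUP_01 Fsamp-te 0.5 0.25 0.0",
]
means, per_item = parse_results(sample)
print(means)
print(per_item)
```

To process a downloaded file, the same function can be given an open file handle, since iterating over a text file yields its lines.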