ImageCLEFsecurity

Motivation

File Forgery Detection (FFD) is a serious problem for digital forensics examiners. Fraud and counterfeiting are common motives for altering files. Another example is a child predator who hides pornographic images by altering the image extension and, in some cases, the image signature. Many proposals have been made to solve this problem, and the most promising ones concentrate on the image content. It is also common for someone who wants to hide information in plain sight, without being noticed, to use steganography. Steganography is the practice of concealing a file, message, image, or video within another file, message, image, or video. The word steganography combines the Greek words steganos (στεγανός), meaning "covered", and graphein (γράφειν), meaning "writing". The most common cover media for hiding data are images.
The objective of this task is first to determine whether an image has been forged, then whether it also hides a text message, and lastly to retrieve the hidden message from the forged stego images.

News

Task description

Competition Scenario
You are a professional digital forensic examiner collaborating with the police, who suspect an ongoing fraud at the Central Bank. After obtaining a court order, the police gain access to a suspect’s computer in the bank in order to look for images proving the suspect’s guilt. However, the police suspect that he has managed to change the extension and signature of some images so that they look like pdf files. Additionally, it is highly probable that the suspect has used steganography software to hide messages within some images, which could reveal valuable information about his collaborators. The police authorities ask you to:

Task 1: Identify Forged Images
Perform detection of altered (forged) images (both extension and signature) and predict the actual type of the forged file.

Task 2: Identify Stego Images
Identify the altered images that hide steganographic content.

Task 3: Retrieve the Message
Retrieve the hidden messages (text) from the stego images.
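
As background for Task 1, the sketch below shows what a plain signature check looks like. It maps the well-known leading bytes ("magic numbers") of jpg, png, gif, and pdf files to a type. The names SIGNATURES, type_from_signature, and signature_matches_extension are illustrative and not part of the task material. Because the forged files in this task have both extension and signature altered, a check like this is expected to serve only as a baseline; it would not reveal such files on its own.

```python
# Well-known leading byte signatures for the file types named in this task.
SIGNATURES = {
    "jpg": [b"\xff\xd8\xff"],
    "png": [b"\x89PNG\r\n\x1a\n"],
    "gif": [b"GIF87a", b"GIF89a"],
    "pdf": [b"%PDF-"],
}


def type_from_signature(header: bytes):
    """Return the type whose signature prefixes `header`, or None if unknown."""
    for file_type, prefixes in SIGNATURES.items():
        if any(header.startswith(prefix) for prefix in prefixes):
            return file_type
    return None


def signature_matches_extension(filename: str, header: bytes) -> bool:
    """Check whether the extension of `filename` agrees with the signature."""
    extension = filename.rsplit(".", 1)[-1].lower()
    if extension == "jpeg":  # treat .jpeg as .jpg
        extension = "jpg"
    return type_from_signature(header) == extension
```

A file named report.pdf whose first bytes are a JPEG signature would fail signature_matches_extension; a file whose signature was also rewritten to look like a pdf would pass, which is why content-based analysis is needed.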

Data

The dataset contains 9,000 images and PDFs, divided into 3 sets of 3,000 files, one set per task. In each set, 2,000 files are used for training and 1,000 for testing. All participants have access to the training dataset along with its ground truth. The test set is distributed without the ground truth.
More information and the datasets are available at:
1. https://www.crowdai.org/challenges/imageclef-2019-security-forged-file-d...
2. https://www.crowdai.org/challenges/imageclef-2019-security-stego-image-d...
3. https://www.crowdai.org/challenges/imageclef-2019-security-secret-messag...

Evaluation methodology

For assessing performance, classic metrics are used:
Precision, Recall and F1 for Task 1 and Task 2.
Edit distance for Task 3.

Precision
In pattern recognition, information retrieval and binary classification, precision is the fraction of relevant instances among the retrieved instances.
For Task 1, precision is the fraction of correctly detected altered images among all images detected as altered:

Precision = number of correctly detected altered images / total number of images detected as altered

For Task 2, precision is the fraction of correctly detected images with hidden messages among all images detected as containing a hidden message:

Precision = number of correctly detected images with hidden messages / total number of images detected as containing hidden messages

Recall
In pattern recognition, information retrieval and binary classification, recall is the fraction of relevant instances that have been retrieved over the total amount of relevant instances.
For Task 1, recall is the fraction of correctly detected altered images among all altered images:

Recall = number of correctly detected altered images / total number of altered images

For Task 2, recall is the fraction of correctly detected images with hidden messages among all images that contain a hidden message:

Recall = number of correctly detected images with hidden messages / total number of altered images with hidden messages

F-measure
F-measure is the harmonic mean of precision and recall, mathematically expressed as

F_1 = 2 ∙ (Precision ∙ Recall) / (Precision + Recall)
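
The three measures above can be sketched in a few lines. This is an illustration under assumptions, not the official scoring script: it treats detection as binary, represents detections and ground truth as sets of figure IDs, and returns 0.0 where a denominator would be zero. The function name and the IDs in the example are made up.

```python
def precision_recall_f1(predicted: set, actual: set):
    """Compute precision, recall and F1 from sets of figure IDs.

    predicted: IDs the system flagged (e.g. as altered).
    actual:    IDs that are truly positive according to the ground truth.
    """
    true_positives = len(predicted & actual)
    precision = true_positives / len(predicted) if predicted else 0.0
    recall = true_positives / len(actual) if actual else 0.0
    total = precision + recall
    f1 = 2 * precision * recall / total if total else 0.0
    return precision, recall, f1


# Hypothetical example: 3 detections, 4 truly altered images, 2 in common.
predicted = {"1741_01", "1743_01", "1744_01"}
actual = {"1741_01", "1743_01", "1745_01", "1746_01"}
precision, recall, f1 = precision_recall_f1(predicted, actual)
print(f"precision={precision:.3f} recall={recall:.3f} f1={f1:.3f}")
```

In this example precision is 2/3, recall is 2/4, and F1 is 4/7.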

Edit distance
Given two strings a and b over an alphabet Σ (e.g. the set of ASCII characters), the edit distance d(a,b) is the total weight of the minimum-weight series of edit operations (insertion, deletion, substitution) that transforms a into b.
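
A minimal sketch of this definition follows, assuming every insertion, deletion, and substitution has weight 1 (the Levenshtein distance). The task description does not state the weights used for scoring, so the unit weights are an assumption, and the function name edit_distance is illustrative.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between a and b with unit operation weights."""
    # previous[j] holds the distance between the processed prefix of a and b[:j].
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]  # distance from a[:i] to the empty string
        for j, char_b in enumerate(b, start=1):
            deletion = previous[j] + 1
            insertion = current[j - 1] + 1
            substitution = previous[j - 1] + (char_a != char_b)
            current.append(min(deletion, insertion, substitution))
        previous = current
    return previous[-1]


# One substitution turns "abcdef" into "absdef".
print(edit_distance("abcdef", "absdef"))
```

The routine keeps only two rows of the dynamic-programming table, so memory grows with the length of b rather than with the product of both lengths.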

Preliminary Schedule

    20.11.2018: Registration opens
    01.12.2018: Development data release
    18.03.2019: Test data release
    01.05.2019: Deadline for submission of runs by the participants (11:59:59 PM GMT).
    03.05.2019: CLEF Submission of Abstracts of Long and Short Papers.
    13.05.2019: Release of processed results by the task organizers.
    24.05.2019: CLEF Submission of CEUR-WS Participant Papers.
    07.06.2019: Notification of acceptance of the working notes papers.
    28.06.2019: CLEF CEUR-WS Working Notes Camera Ready submission.
    09.-12.09.2019: CLEF 2019, Lugano, Switzerland

Registration

Registration is NOT required for this challenge.

Submission instructions

Please note that each group is allowed a maximum of 10 runs per task.

Task 1: Identify Forged Images
For the submission of the task we expect the following format:
<Figure-ID>;<initial Image type>
e.g.:

1741_01;jpg if the document is classified as forged and was originally a jpg file
1742_01;pdf if the document is classified as not forged
1743_01;png if the document is classified as forged and was originally a png file

You need to respect the following constraints:
The separator between the figure ID and the image type has to be a semicolon (;).
The file to upload must be a .txt file.
The original image type of a forged file can be jpg, gif, or png.
Each figure ID of the test set must be included in the runfile exactly once (even if there is no result).
The result cannot be specified more than once for the same figure ID.
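
The constraints above can be checked before uploading. The sketch below is one possible pre-submission check, not an official validator. It assumes that pdf is the label for a file that is not forged (as in the example), that blank lines may be ignored, and that the list of test figure IDs is available; the names ALLOWED_TYPES and validate_task1_run are illustrative.

```python
# jpg, gif, png for forged files; pdf for files classified as not forged.
ALLOWED_TYPES = {"jpg", "gif", "png", "pdf"}


def validate_task1_run(lines, test_ids):
    """Return a list of problems found in a Task 1 run file (empty if none)."""
    errors = []
    seen = set()
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # assumption: blank lines are ignored
        figure_id, separator, file_type = line.partition(";")
        if not separator:
            errors.append(f"line {number}: missing ';' separator")
            continue
        if figure_id not in test_ids:
            errors.append(f"line {number}: unknown figure ID {figure_id!r}")
        if figure_id in seen:
            errors.append(f"line {number}: duplicate figure ID {figure_id!r}")
        seen.add(figure_id)
        if file_type not in ALLOWED_TYPES:
            errors.append(f"line {number}: unsupported type {file_type!r}")
    for missing in sorted(set(test_ids) - seen):
        errors.append(f"missing figure ID {missing!r}")
    return errors


run = ["1741_01;jpg", "1742_01;pdf", "1743_01;png"]
print(validate_task1_run(run, {"1741_01", "1742_01", "1743_01"}))
```

An empty list means the run satisfies the format rules listed above; it says nothing about the quality of the predictions.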

Task 2: Identify Stego Images
For the submission of the task we expect the following format:
<Figure-ID>;<1/0>, where 1 means yes and 0 means no
e.g.:
1741_02;1 if the image includes stego
1742_02;0 if the image does NOT include stego
1743_02;1 if the image includes stego

You need to respect the following constraints:
The separator between the figure ID and the description has to be a semicolon (;).
The file to upload must be a .txt file.
Each figure ID of the test set must be included in the runfile exactly once.
The result cannot be specified more than once for the same figure ID.
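
One way to produce a run file in this format is sketched below. It assumes the predictions are held in a dictionary that maps each figure ID to a boolean; the function names and the output filename are hypothetical.

```python
from pathlib import Path


def format_task2_run(predictions: dict) -> str:
    """Render predictions as '<Figure-ID>;<1/0>' lines, one per figure ID."""
    lines = [
        f"{figure_id};{1 if has_stego else 0}"
        for figure_id, has_stego in sorted(predictions.items())
    ]
    return "\n".join(lines) + "\n"


def save_task2_run(predictions: dict, path: str = "task2_run.txt") -> None:
    """Write the run to a .txt file (the filename is a placeholder)."""
    Path(path).write_text(format_task2_run(predictions), encoding="utf-8")


print(format_task2_run({"1741_02": True, "1742_02": False, "1743_02": True}))
```

Keeping the predictions in a dictionary keyed by figure ID means each ID can appear only once in the output, which matches the last two constraints.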

Task 3: Retrieve the Message
For the submission of the task we expect the following format:
<Figure-ID>;<stego>
e.g.:
1743_03;abcdef if the image includes the hidden message abcdef
1744_03;y5fg3687 if the image includes the hidden message y5fg3687

You need to respect the following constraints:
The separator between the figure ID and the description has to be a semicolon (;).
The file to upload must be a .txt file.
Each figure ID of the test set must be included in the runfile exactly once.
The result cannot be specified more than once for the same figure ID.

Contact

Organizers

  • Narciso Garcia, Professor, Dr., Grupo de Tratamiento de Imágenes, Dpto. Señales, Sistemas y Radiocomunicaciones, E.T.S. Ingenieros de Telecomunicación, Spain, narciso@gti.ssr.upm.es
  • Ergina Kavallieratou, Associate Professor, Dr., AIlab, Department of Information & Communication Systems Engineering, University of the Aegean, Greece, kavallieratou@aegean.gr
  • Carlos Roberto del Blanco, Assistant Professor, Dr., Grupo de Tratamiento de Imágenes, Dpto. Señales, Sistemas y Radiocomunicaciones, E.T.S. Ingenieros de Telecomunicación, Spain, cda@gti.ssr.upm.es
  • Carlos Cuevas Rodríguez, Assistant Professor, Dr., Grupo de Tratamiento de Imágenes, Dpto. Señales, Sistemas y Radiocomunicaciones, E.T.S. Ingenieros de Telecomunicación, Spain, ccr@gti.ssr.upm.es
  • Nikos Vasillopoulos, PhD, Postdoc, AIlab, Department of Information & Communication Systems Engineering, University of the Aegean, Greece, nvasilopoulos@aegean.gr
  • Konstantinos Karampidis, MSc, PhD student, University of the Aegean, Greece, karampidis@aegean.gr

For questions about the Security task, e-mail: Imageclefsecurity@aegean.gr

Results

Recommended Reading

[1] California Institute of Technology, “Caltech256.” [Online]. Available: http://www.vision.caltech.edu/Image_Datasets/Caltech256/. [Accessed: 14-Jan-2018].
[2] “UC Berkeley Computer Vision Group - Contour Detection and Image Segmentation - Resources.” [Online]. Available: https://www2.eecs.berkeley.edu/Research/Projects/CS/vision/grouping/reso.... [Accessed: 02-Jun-2018].
[3] K. Karampidis and G. Papadourakis, “File Type Identification for Digital Forensics,” Springer International Publishing, 2016, pp. 266–274.
[4] K. Karampidis, E. Kavallieratou, and G. Papadourakis, “A review of image steganalysis techniques for digital forensics,” J. Inf. Secur. Appl., vol. 40, pp. 217–235, Jun. 2018.
[5] K. Karampidis, E. Kavallieratou, and G. Papadourakis, “Comparison of Classification algorithms for File Type Detection,” Polibits, vol. 56, pp. 15–20, 2018.
[6] J. D. Evensen, S. Lindahl, and M. Goodwin, “Filetype Detection Using Naïve Bayes and n-gram Analysis,” Norwegian Information Security Conference, NISK, vol. 7, no. 1. Fredrikstad, 2014.
[7] I. Ahmed, K. Lhee, H. Shin, and M. Hong, “Fast content-based file-type identification,” in 7th Annual IFIP WG 11.9 International Conference on Digital Forensics, 2011, pp. 65–75.
[8] T. Pevny, P. Bas, and J. Fridrich, “Steganalysis by subtractive pixel adjacency matrix,” IEEE Transactions on Information Forensics and Security, vol. 5, no. 2, pp. 215–224, 2010.
[9] J. Fridrich and J. Kodovsky, “Rich models for steganalysis of digital images,” IEEE Transactions on Information Forensics and Security, vol. 7, no. 3, pp. 868–882, 2012.
[10] M. Devi and N. Sharma, “Improvements of steganography parameter in binary images and JPEG images against steganalysis,” International Journal of Engineering Sciences and Research Technology, vol. 2, no. 8, 2013.
[11] J. J. Harmsen and W. A. Pearlman, “Steganalysis of additive-noise modelable information hiding,” in Security and watermarking of multimedia contents, 2003, pp. 131–142.
[12] J. Kodovsky, J. Fridrich, and V. Holub, “Ensemble classifiers for steganalysis of digital media,” IEEE Transactions on Information Forensics and Security, vol. 7, no. 2, pp. 432–444, 2012.

Helpful tools and resources

https://www.garykessler.net/library/file_sigs.html
https://www.cs.waikato.ac.nz/ml/weka/
https://www.garykessler.net/library/fsc_stego.html