Remove metadata from user-uploaded images in Django
When a site accepts user-uploaded images, it's a good idea to strip the metadata that is often embedded in the image files. This metadata can include the camera model and settings as well as geotags that record where a photo was taken. To protect users' privacy, this data should be removed before the photo is stored and shown to other users. This is fairly straightforward with the well-known exiftool, which can be integrated into a Django web application through the standard forms framework or a Django REST Framework serializer.
The code below was developed and tested on Python 3.x / Django 1.8.
models.py
    from django.db import models


    class Image(models.Model):
        image = models.ImageField()
image_utils.py
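    from io import BytesIO
    import subprocess


    def strip_metadata(fp):  # fp is a Django UploadedFile
        # '-All=' deletes all metadata tags; '-' makes exiftool read the
        # image from stdin and write the result to stdout.
        args = ['exiftool', '-All=', '-']
        p = subprocess.Popen(args, stdin=subprocess.PIPE, stdout=subprocess.PIPE)
        out, _ = p.communicate(input=fp.read())
        # Fail loudly rather than storing whatever a failed run left on stdout.
        if p.returncode != 0:
            raise RuntimeError('exiftool exited with code %d' % p.returncode)
        return BytesIO(out)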
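If you are on Python 3.5 or newer, the same pipe can be written with subprocess.run, which also makes it easy to bound how long the external process may run. The following is only a sketch under that assumption: the name pipe_through and the 10-second default timeout are hypothetical choices, not part of the original helper.

```python
import subprocess
from io import BytesIO


def pipe_through(args, data, timeout=10):
    """Send `data` (bytes) to the command's stdin and return its stdout.

    `pipe_through` and the default timeout are illustrative assumptions.
    """
    result = subprocess.run(
        args,
        input=data,
        stdout=subprocess.PIPE,
        stderr=subprocess.PIPE,
        timeout=timeout,  # raises subprocess.TimeoutExpired if exceeded
        check=True,       # raises subprocess.CalledProcessError on non-zero exit
    )
    return BytesIO(result.stdout)
```

With this in place, strip_metadata(fp) could be reduced to pipe_through(['exiftool', '-All=', '-'], fp.read()), and the caller decides how to report a timeout or a failed run.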
forms.py
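    from django import forms
    from django.core.files import File

    from . import models
    from .image_utils import strip_metadata


    class ImageForm(forms.ModelForm):
        class Meta:
            model = models.Image
            fields = ['image']  # ModelForm requires 'fields' or 'exclude' as of Django 1.8

        image = forms.ImageField()

        def clean_image(self):
            f_orig = self.cleaned_data['image']
            fn = f_orig.name
            # Pipe the upload through exiftool and wrap the result in a Django File
            sanitized_image = strip_metadata(f_orig)
            f_new = File(sanitized_image)
            f_new.name = fn  # keep the original filename
            return f_new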
serializers.py (if using Django REST Framework)
    from django.core.files import File
    from rest_framework import serializers

    from . import models
    from .image_utils import strip_metadata


    class ImageSerializer(serializers.ModelSerializer):
        class Meta:
            model = models.Image
            fields = ['image']  # newer DRF releases require 'fields' or 'exclude'

        image = serializers.ImageField()

        def save(self, **kwargs):
            f_orig = self.validated_data['image']
            fn = f_orig.name
            # Pipe the upload through exiftool and wrap the result in a Django File
            sanitized_image = strip_metadata(f_orig)
            f_new = File(sanitized_image)
            f_new.name = fn  # keep the original filename
            # Swap the scrubbed file in before the model instance is created
            self.validated_data['image'] = f_new
            return super().save(**kwargs)
Because the image data is piped through exiftool and processed entirely in memory, there is no need to create any temporary files on the filesystem; the newly scrubbed image transparently replaces the original upload and is ready to go. However, this step can become a bottleneck and a vector for denial-of-service attacks using very large files, so you should explicitly cap the maximum upload/POST body size at a reasonable value in your web server and/or check the file size with a validator function on the ImageField.