Skip to content

Micropython binding for the ESP32 DL AI vision models like face detection / recognition, imagenet classifier or pedestrian (human) detection

License

Notifications You must be signed in to change notification settings

cnadler86/mp_esp_dl_models

Folders and files

NameName
Last commit message
Last commit date

Latest commit

 

History

82 Commits
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 
 

Repository files navigation

ESP DL MicroPython Binding

This is a MicroPython binding for ESP-DL (Deep Learning) models that enables face detection, face recognition, human detection, and image classification on ESP32 devices.

Donate

I spent a lot of time and effort to make this. If you find this project useful, please consider donating to support my work. Donate

Available Models

  • FaceDetector: Detects faces in images and provides bounding boxes and facial features
  • FaceRecognizer: Recognizes enrolled faces and manages a face database
  • HumanDetector: Detects people in images and provides bounding boxes
  • ImageNet: Classifies images into predefined categories

Installation & Building

Precompiled Images

You can find precompiled images in two ways:

  1. In the Actions section for passed workflows under artifacts
  2. By forking the repo and manually starting the action

Building from Source

  1. Clone the required repositories:
git clone https://github.com/cnadler86/mp_esp_dl_models.git
git clone https://github.com/cnadler86/micropython-camera-API.git
git clone https://github.com/cnadler86/mp_jpeg.git
  1. Build the firmware: Make sure you have the complete ESP32 build environment for MicroPython available.
cd boards/
idf.py -D MICROPY_DIR=<micropython-dir> -D MICROPY_BOARD=<BOARD_NAME> -D MICROPY_BOARD_VARIANT=<BOARD_VARIANT> -B build-<your-build-name> build
cd build-<your-build-name>
python ~/micropython/ports/esp32/makeimg.py sdkconfig bootloader/bootloader.bin partition_table/partition-table.bin micropython.bin firmware.bin micropython.uf2

Module Usage

Common Requirements

All models require input images in RGB888 format. You can use mp_jpeg to decode camera images to the correct format.

FaceDetector

The FaceDetector module detects faces in images and can optionally provide facial feature points.

Constructor

FaceDetector(width=320, height=240, features=True)

Parameters:

  • width (int, optional): Input image width. Default: 320
  • height (int, optional): Input image height. Default: 240
  • features (bool, optional): Whether to return facial feature points. Default: True

Methods

  • run(framebuffer)

    Detects faces in the provided image.

    Parameters:

    • framebuffer: RGB888 image data (required)

    Returns: List of dictionaries with detection results, each containing:

    • score: Detection confidence (float)
    • box: Bounding box coordinates [x1, y1, x2, y2]
    • features: Facial feature points [(x,y) coordinates for: left eye, right eye, nose, left mouth, right mouth] if enabled, None otherwise

FaceRecognizer

The FaceRecognizer module manages a database of faces and can recognize previously enrolled faces.

Constructor

FaceRecognizer(width=320, height=240, db_path="face.db")

Parameters:

  • width (int, optional): Input image width. Default: 320
  • height (int, optional): Input image height. Default: 240
  • db_path (str, optional): Path to the face database file. Default: "face.db"

Methods

  • run(framebuffer)

    Detects and recognizes faces in the provided image.

    Parameters:

    • framebuffer: RGB888 image data (required)

    Returns: List of dictionaries with recognition results, each containing:

    • score: Detection confidence
    • box: Bounding box coordinates [x1, y1, x2, y2]
    • features: Facial feature points (if enabled)
    • person: Recognition result containing:
      • id: Face ID
      • similarity: Match confidence (0-1)
      • name: Person name (if provided during enrollment)
  • enroll(framebuffer, validate=False, name=None)

    Enrolls a new face in the database.

    Parameters:

    • framebuffer: RGB888 image data
    • validate (bool, optional): Check if face is already enrolled. Default: False
    • name (str, optional): Name to associate with the face. Default: None

    Returns:

    • ID of the enrolled face
  • delete_face(id)

    Deletes a face from the database.

    Parameters:

    • id (int): ID of the face to delete
  • print_database()

    Prints the contents of the face database.

HumanDetector

The HumanDetector module detects people in images.

Constructor

HumanDetector(width=320, height=240)

Parameters:

  • width (int, optional): Input image width. Default: 320
  • height (int, optional): Input image height. Default: 240

Methods

  • run(framebuffer)

    Detects people in the provided image.

    Parameters:

    • framebuffer: RGB888 image data

    Returns: List of dictionaries with detection results, each containing:

    • score: Detection confidence
    • box: Bounding box coordinates [x1, y1, x2, y2]

ImageNet

The ImageNet module classifies images into predefined categories.

Constructor

ImageNet(width=320, height=240)

Parameters:

  • width (int, optional): Input image width. Default: 320
  • height (int, optional): Input image height. Default: 240

Methods

  • run(framebuffer)

    Classifies the provided image.

    Parameters:

    • framebuffer: RGB888 image data

    Returns: List alternating between class names and confidence scores: [class1, score1, class2, score2, ...]

Usage Examples

Face Detection Example

from espdl import FaceDetector
import camera
from jpeg import Decoder

# Initialize components
cam = camera.Camera()
decoder = Decoder()
face_detector = FaceDetector()

# Capture and process image
img = cam.capture()
framebuffer = decoder.decode(img)  # Convert to RGB888
results = face_detector.run(framebuffer)

if results:
    for face in results:
        print(f"Face detected with confidence: {face['score']}")
        print(f"Bounding box: {face['box']}")
        if face['features']:
            print(f"Facial features: {face['features']}")

Face Recognition Example

from espdl import FaceRecognizer
import camera
from jpeg import Decoder

# Initialize components
cam = camera.Camera()
decoder = Decoder()
recognizer = FaceRecognizer(db_path="/faces.db")

# Enroll a face
img = cam.capture()
framebuffer = decoder.decode(img)
face_id = recognizer.enroll(framebuffer, name="John")
print(f"Enrolled face with ID: {face_id}")

# Later, recognize faces
img = cam.capture()
framebuffer = decoder.decode(img)
results = recognizer.run(framebuffer)

if results:
    for face in results:
        if face['person']:
            print(f"Recognized {face['person']['name']} (ID: {face['person']['id']})")
            print(f"Similarity: {face['person']['similarity']}")

Benchmark results

The following table shows the frames per second (fps) for different image sizes and models. The results are based on a test with a 2MP camera and a ESP32S3.

Frame Size FaceDetector HumanDetector
QQVGA 14.5 6.6
R128x128 21 6.6
QCIF 19.7 6.5
HQVGA 18 6.3
R240X240 16.7 6.1
QVGA 15.2 6.6
CIF 13 5.5
HVGA 11.9 5.3
VGA 8.2 4.4
SVGA 6.2 3.8
XGA 4.1 2.8
HD 3.6 2.6

Notes & Best Practices

  1. Image Format: Always ensure input images are in RGB888 format. Use mp_jpeg for JPEG decoding from camera.

  2. Memory Management:

    • Close/delete detector objects when no longer needed
    • Consider memory constraints when choosing image dimensions
  3. Face Recognition:

    • Enroll faces in good lighting conditions
    • Multiple enrollments of the same person can improve recognition
    • Use validate=True during enrollment to avoid duplicates
  4. Storage:

    • Face database is persistent across reboots
    • Consider backing up the face database file

About

Micropython binding for the ESP32 DL AI vision models like face detection / recognition, imagenet classifier or pedestrian (human) detection

Topics

Resources

License

Stars

Watchers

Forks

Releases

No releases published

Sponsor this project