Hog PeopleDetector() does not detect people

Hi Community

I am currently taking the OpenCV course and experimenting with the HOGDescriptor people detector.
I did exercise 3.5, and it works fine with the example image.

Now I am testing the same with some other images.

But with these images it does not detect correctly.



As far as I know, the only parameters I can change are winStride, padding and scale, as well as the size the image is resized to.

I tried different combinations of these parameters, but I don't get satisfying results. Does somebody know how to improve the detection so that it finds the players correctly?

Here is my code:
{
#!/usr/bin/env python

import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge, CvBridgeError
import cv2
import numpy as np


class LoadPeople(object):

    def __init__(self):
        self.image_sub = rospy.Subscriber("/camera/rgb/image_raw", Image, self.camera_callback)
        self.bridge_object = CvBridge()

    def camera_callback(self, data):
        try:
            # We select bgr8 because it is the OpenCV encoding by default
            cv_image = self.bridge_object.imgmsg_to_cv2(data, desired_encoding="bgr8")
        except CvBridgeError as e:
            print(e)

        hog = cv2.HOGDescriptor()

        # We set the hog descriptor as a People detector
        hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

        img_nr = 1

        if img_nr == 1:
            img = cv2.imread('/home/user/catkin_ws/src/unit3_exercises/mbappe.jpg')

            # resize
            imX = 1280
            imY = 921
            img = cv2.resize(img, (imX, imY))
        else:
            img = cv2.imread('/home/user/catkin_ws/src/unit3_exercises/mancity.jpg')

            # resize
            imX = 1400
            imY = 920
            img = cv2.resize(img, (imX, imY))

        # Hog: detectMultiScale returns (x, y, w, h); convert to corner format
        boxes, weights = hog.detectMultiScale(img, winStride=(2, 2), padding=(8, 8), scale=1.01)
        boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])

        for (xA, yA, xB, yB) in boxes:

            # Center in X
            medX = xB - xA
            xC = int(xA + (medX / 2))

            # Center in Y
            medY = yB - yA
            yC = int(yA + (medY / 2))

            # Draw a circle in the center of the box
            cv2.circle(img, (xC, yC), 1, (0, 255, 255), -1)

            # display the detected boxes in the original picture
            cv2.rectangle(img, (xA, yA), (xB, yB),
                          (255, 255, 0), 2)

        cv2.imshow('soccer', img)

        cv2.waitKey(1)


def main():
    load_people_object = LoadPeople()
    rospy.init_node('load_people_node', anonymous=True)
    try:
        rospy.spin()
    except KeyboardInterrupt:
        print("Shutting down")
    cv2.destroyAllWindows()


if __name__ == '__main__':
    main()

}

Thanks in advance for any help

Hi @Dkae ,

From your images, it looks like your HOG detector has image scaling issues.
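To make the scaling point concrete, here is a rough sketch of how detectMultiScale steps through its image pyramid, as far as I understand the OpenCV implementation. It assumes the default 64x128 detection window and the default limit of 64 pyramid levels. The names `pyramid_scales` and `tallest_window` are made up for this post and are not part of OpenCV.

```python
def pyramid_scales(img_w, img_h, scale, win=(64, 128), nlevels=64):
    """Approximate the list of pyramid scales that detectMultiScale visits.

    Assumption: this mirrors the level loop in OpenCV's
    HOGDescriptor::detectMultiScale (window 64x128, at most 64 levels).
    It is an illustration, not an official API.
    """
    scales = []
    s = 1.0
    for _ in range(nlevels):
        scales.append(s)
        # Stop once the shrunken image no longer fits one detection window.
        if round(img_w / s) < win[0] or round(img_h / s) < win[1] or scale <= 1:
            break
        s *= scale
    return scales


def tallest_window(img_w, img_h, scale, win_h=128):
    """Upper bound, in original-image pixels, on the tallest window searched."""
    return win_h * pyramid_scales(img_w, img_h, scale)[-1]
```

Under these assumptions, `tallest_window(1280, 921, 1.01)` comes out at roughly 240 px, because 64 levels at a factor of 1.01 only shrink the image by about 1.87x. A player who fills most of a 921 px tall frame would then be larger than any window that gets searched, so resizing the image down or using a larger scale factor may be worth a try.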

Here is a link to all HOG based tuning parameters: HOG detectMultiScale parameters explained - PyImageSearch

Properties such as winStride, non-maxima suppression and mean-shift grouping are explained there, which should give you an idea of how to detect multiple people in the same image.
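As an illustration of the non-maxima suppression idea, here is a minimal pure-Python sketch that also uses the weights returned by detectMultiScale. It is not the code from the linked article. The function names and the default thresholds (`min_score=0.5`, `iou_thresh=0.5`) are assumptions chosen for the example and would need tuning on your images.

```python
def to_corners(rects):
    """Convert (x, y, w, h) rectangles to (x1, y1, x2, y2) corners."""
    return [(x, y, x + w, y + h) for (x, y, w, h) in rects]


def iou(a, b):
    """Intersection over union of two corner-format boxes."""
    iw = max(0, min(a[2], b[2]) - max(a[0], b[0]))
    ih = max(0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = iw * ih
    union = (a[2] - a[0]) * (a[3] - a[1]) + (b[2] - b[0]) * (b[3] - b[1]) - inter
    return inter / union if union > 0 else 0.0


def nms_by_score(rects, scores, min_score=0.5, iou_thresh=0.5):
    """Greedy NMS: visit boxes from highest to lowest score and keep a box
    only if it does not overlap an already kept box too much."""
    boxes = to_corners(rects)
    # Drop weak detections first, then sort the rest by descending score.
    order = sorted(
        (i for i, s in enumerate(scores) if s >= min_score),
        key=lambda i: scores[i],
        reverse=True,
    )
    kept = []
    for i in order:
        if all(iou(boxes[i], boxes[k]) <= iou_thresh for k in kept):
            kept.append(i)
    return [boxes[i] for i in kept]
```

To use it with HOG, pass the rectangles from detectMultiScale as `rects` and the flattened weights as `scores` (for example via `numpy.ravel(weights).tolist()`). Note that the rectangles are converted from (x, y, w, h) to corners before any overlap is computed.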

I hope this helps you.

Regards,
Girish

Hi @girishkumar.kannan

Thanks for your response. I have looked into the theory behind the HOG parameters and ran a few more tests.
Below are some parameter settings and their results:

Without a scale argument and without non-max suppression:

{
boxes, weights = hog.detectMultiScale(img, winStride=(4, 4))
boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
}

With a scaling factor, which I think gives the most detections:

{
boxes, weights = hog.detectMultiScale(img, winStride=(4, 4), scale=1.02)
boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
}

With non-max suppression added:

{
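boxes, weights = hog.detectMultiScale(img, winStride=(4, 4), scale=1.02)
# non_max_suppression_fast expects (x1, y1, x2, y2) corners, not (x, y, w, h)
boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
overlapThresh = 0.5
boxes = non_max_suppression_fast(boxes, overlapThresh)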
}

These are about the best results I got, so nothing really satisfying.

Adding padding does not really change anything, and enabling mean-shift grouping makes it even worse: it detects fewer people. Increasing or decreasing the scaling factor also makes it worse.

So I don't know how to modify my code to get the desired result.
Any ideas about what else I could test?

Here is the complete code:

{
#!/usr/bin/env python

import rospy
from sensor_msgs.msg import Image
from cv_bridge import CvBridge, CvBridgeError
import cv2
import numpy as np


class LoadPeople(object):

    def __init__(self):
    
        self.image_sub = rospy.Subscriber("/camera/rgb/image_raw",Image,self.camera_callback)
        self.bridge_object = CvBridge()

    def camera_callback(self,data):
        try:
            # We select bgr8 because its the OpenCV encoding by default
            cv_image = self.bridge_object.imgmsg_to_cv2(data, desired_encoding="bgr8")
        except CvBridgeError as e:
            print(e)
        

        hog = cv2.HOGDescriptor()

        #We set the hog descriptor as a People detector
        hog.setSVMDetector(cv2.HOGDescriptor_getDefaultPeopleDetector())

        img_nr = 1

        if img_nr == 1:
            img = cv2.imread('/home/user/catkin_ws/src/unit3_exercises/mbappe.jpg')
            #resize
            imX = 1280
            imY = 921
            img = cv2.resize(img,(imX,imY))
        elif img_nr == 2 :
            img = cv2.imread('/home/user/catkin_ws/src/unit3_exercises/mancity.jpg')
            #resize
            imX = 1400
            imY = 920
            img = cv2.resize(img,(imX,imY))
        elif img_nr == 3 :
            img = cv2.imread('/home/user/catkin_ws/src/unit3_exercises/lewandowski.jpg')
            #resize
            imX = 880
            imY = 552
            img = cv2.resize(img,(imX,imY))
        elif img_nr == 4 :
            img = cv2.imread('/home/user/catkin_ws/src/unit3_exercises/person.jpg')
            #resize
            imX = 306
            imY = 612
            img = cv2.resize(img,(imX,imY))
        
        #Hog
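        boxes, weights = hog.detectMultiScale(img, winStride=(4, 4), scale=1.02)
        # Convert (x, y, w, h) to (x1, y1, x2, y2) corners, which is what
        # non_max_suppression_fast and the drawing loop below expect
        boxes = np.array([[x, y, x + w, y + h] for (x, y, w, h) in boxes])
        overlapThresh = 0.5
        boxes = non_max_suppression_fast(boxes, overlapThresh)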

        for (xA, yA, xB, yB) in boxes:
            
            #Center in X 
            medX = xB - xA 
            xC = int(xA+(medX/2)) 

            #Center in Y
            medY = yB - yA 
            yC = int(yA+(medY/2)) 

            #Draw a circle in the center of the box 
            cv2.circle(img,(xC,yC), 1, (0,255,255), -1)

            # display the detected boxes in the original picture
            cv2.rectangle(img, (xA, yA), (xB, yB),
                                (255, 255, 0), 2)    

        cv2.imshow('soccer',img)
                
                

        cv2.waitKey(1)

# Non-maximum suppression, after Malisiewicz et al.
def non_max_suppression_fast(boxes, overlapThresh):
	# if there are no boxes, return an empty list
	if len(boxes) == 0:
		return []
	# if the bounding boxes are integers, convert them to floats --
	# this is important since we'll be doing a bunch of divisions
	if boxes.dtype.kind == "i":
		boxes = boxes.astype("float")
	# initialize the list of picked indexes	
	pick = []
	# grab the coordinates of the bounding boxes
	x1 = boxes[:,0]
	y1 = boxes[:,1]
	x2 = boxes[:,2]
	y2 = boxes[:,3]
	# compute the area of the bounding boxes and sort the bounding
	# boxes by the bottom-right y-coordinate of the bounding box
	area = (x2 - x1 + 1) * (y2 - y1 + 1)
	idxs = np.argsort(y2)
	# keep looping while some indexes still remain in the indexes
	# list
	while len(idxs) > 0:
		# grab the last index in the indexes list and add the
		# index value to the list of picked indexes
		last = len(idxs) - 1
		i = idxs[last]
		pick.append(i)
		# find the largest (x, y) coordinates for the start of
		# the bounding box and the smallest (x, y) coordinates
		# for the end of the bounding box
		xx1 = np.maximum(x1[i], x1[idxs[:last]])
		yy1 = np.maximum(y1[i], y1[idxs[:last]])
		xx2 = np.minimum(x2[i], x2[idxs[:last]])
		yy2 = np.minimum(y2[i], y2[idxs[:last]])
		# compute the width and height of the bounding box
		w = np.maximum(0, xx2 - xx1 + 1)
		h = np.maximum(0, yy2 - yy1 + 1)
		# compute the ratio of overlap
		overlap = (w * h) / area[idxs[:last]]
		# delete the picked index and all indexes from the index list
		# whose overlap exceeds the threshold
		idxs = np.delete(idxs, np.concatenate(([last],
			np.where(overlap > overlapThresh)[0])))
	# return only the bounding boxes that were picked using the
	# integer data type
	return boxes[pick].astype("int")


def main():
    load_people_object = LoadPeople()
    rospy.init_node('load_people_node', anonymous=True)
    try:
        rospy.spin()
    except KeyboardInterrupt:
        print("Shutting down")
    cv2.destroyAllWindows()

if __name__ == '__main__':
    main()
}

Hi @Dkae ,

Since you have experimented with the available parameters and found that they do not give you good results, you should probably consider changing the detector.

Use another method instead of HOG. It seems clear that HOG does not work well with football images.

Before you change the detector model, try the HOG program on a different kind of picture, such as a family photo or a group photo, and see whether the HOG detector finds all the people in it. If that does not work either, move on to another detector model.

The best option, I would say, is to use a pre-trained human detector. There are many neural network models for human detection; choose the one that suits you.
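As one possible starting point, here is a hedged sketch using OpenCV's dnn module with a Caffe MobileNet-SSD model. Everything specific here is an assumption: you would have to download the model files yourself (the `prototxt` and `caffemodel` arguments are placeholders), the class index 15 for "person" and the preprocessing constants apply to the commonly distributed VOC-trained MobileNet-SSD, and the function names are made up for this post. Another model would need different values.

```python
def people_from_ssd(rows, img_w, img_h, person_id=15, min_conf=0.5):
    """Pick person boxes out of SSD-style detection rows.

    Each row is assumed to be
    (image_id, class_id, confidence, x1, y1, x2, y2)
    with coordinates normalised to the range 0..1.
    """
    boxes = []
    for _, class_id, conf, x1, y1, x2, y2 in rows:
        if int(class_id) == person_id and conf >= min_conf:
            # Scale the normalised corners back to pixel coordinates.
            boxes.append((int(x1 * img_w), int(y1 * img_h),
                          int(x2 * img_w), int(y2 * img_h)))
    return boxes


def detect_people_dnn(img, prototxt, caffemodel):
    """Run a Caffe MobileNet-SSD model on a BGR image.

    The two file paths are placeholders for model files you supply.
    """
    import cv2  # imported here so the helper above works without OpenCV

    net = cv2.dnn.readNetFromCaffe(prototxt, caffemodel)
    # Preprocessing constants assumed for the VOC-trained MobileNet-SSD.
    blob = cv2.dnn.blobFromImage(cv2.resize(img, (300, 300)),
                                 0.007843, (300, 300), 127.5)
    net.setInput(blob)
    detections = net.forward()  # expected shape: (1, 1, N, 7)
    h, w = img.shape[:2]
    return people_from_ssd(detections[0, 0], w, h)
```

The returned boxes are in the same (x1, y1, x2, y2) corner format as the drawing loop in your code, so they could be passed to `cv2.rectangle` the same way.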

You can also try a combination of detectors, such as a cascaded HOG-Haar detector or something similar. This is just an idea; I am not telling you to use it.

I cannot test or run your code, so I did not go through it.

Regards,
Girish
