Well, you're on your way. I expected that they would vary a pretty good amount. Even if you had one byte per pixel, a ten percent change might not be all that noticeable.

The next thing I would try would be to section the image into 9, 16, or 25 square sub-images (a 3x3, 4x4, or 5x5 grid). Then average all the bytes in each sub-image, and flag a section when its average changes by 10% or so. The idea is that a change in an individual pixel probably would not mean much, but a change in the average value of a region of the image would be significant.
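Here is a rough sketch of the sectioning idea in plain Python. It assumes each frame is a list of rows of grayscale values (0-255) and that both frames are the same size and at least n pixels on a side. The names block_means and changed_blocks are made up for illustration, and the 10% threshold is just the starting guess from above.

```python
def block_means(image, n):
    """Mean pixel value of each cell in an n x n grid laid over the image."""
    height, width = len(image), len(image[0])
    means = []
    for by in range(n):
        y0, y1 = by * height // n, (by + 1) * height // n
        row = []
        for bx in range(n):
            x0, x1 = bx * width // n, (bx + 1) * width // n
            total = sum(sum(image[y][x0:x1]) for y in range(y0, y1))
            row.append(total / ((y1 - y0) * (x1 - x0)))
        means.append(row)
    return means


def changed_blocks(prev, curr, n=3, threshold=0.10):
    """Return (row, col) of every grid cell whose mean moved by more than threshold."""
    before, after = block_means(prev, n), block_means(curr, n)
    hits = []
    for r in range(n):
        for c in range(n):
            base = max(before[r][c], 1.0)  # assumption: guard against all-black cells
            if abs(after[r][c] - before[r][c]) / base > threshold:
                hits.append((r, c))
    return hits
```

With n=3 you get the 9-section version, n=4 gives 16, and so on. A single pixel jumping by 10% barely moves its cell's mean, so it is ignored, while a whole cell changing gets reported.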

If the camera were pointed at a blank wall, and the individual pixels were changing by 10% or so, would the average value change by 10%? Probably not, unless those changes are biased in one direction or the other. Therefore, averaging a bundle of pixels should dampen the fluctuations in any individual pixel. Thus, you want a number of pixels to average, but what number? Well, if your image is square, then divide it into sub-sections sized so that a change in a single sub-section would be significant. At one extreme, you could average the entire image and look for a change as small as 1% or so. Alternatively, if the field of view is wide enough, you might want to catch a change in a quarter of the image, a ninth of the image, etc.
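If you want to convince yourself of the blank-wall argument, a small simulation along these lines should do it. This is only a sketch under an assumption I am making up: each pixel wobbles independently and uniformly by up to plus or minus 10% around a flat gray level. Real sensor noise will not be exactly like that, and the function names are hypothetical.

```python
import random


def noisy_frame(width, height, level=100, noise=0.10, rng=random):
    """Flat gray frame where every pixel wobbles independently by up to +/- noise."""
    return [[level * (1 + rng.uniform(-noise, noise)) for _ in range(width)]
            for _ in range(height)]


def mean_of(frame):
    """Average value over every pixel in the frame."""
    return sum(map(sum, frame)) / (len(frame) * len(frame[0]))


def relative_mean_change(a, b):
    """How far the whole-frame average moved between two frames, as a fraction."""
    return abs(mean_of(b) - mean_of(a)) / mean_of(a)


def largest_pixel_change(a, b):
    """Largest relative change seen at any single pixel position."""
    return max(abs(pb - pa) / pa
               for row_a, row_b in zip(a, b)
               for pa, pb in zip(row_a, row_b))


if __name__ == "__main__":
    rng = random.Random(0)  # fixed seed so repeated runs match
    first = noisy_frame(100, 100, rng=rng)
    second = noisy_frame(100, 100, rng=rng)
    print("largest single-pixel change:", largest_pixel_change(first, second))
    print("change in the overall mean: ", relative_mean_change(first, second))
```

Individual pixels swing a lot from frame to frame, but the mean over thousands of them hardly moves, because unbiased fluctuations mostly cancel. That is why a threshold as small as 1% can be workable on a large average.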

An alternative to consider would be to think about what would happen if the camera was looking at a room, and a person walked into the room. They would intrude onto one side of the image, then move across the image. Therefore, averaging a couple of columns of pixels might also make sense.
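The column version could look something like this. Same assumptions as before: frames are lists of rows of grayscale values, the function names are invented for the example, and the band width of two columns and the 10% threshold are placeholders to tune.

```python
def column_band_means(image, band_width=2):
    """Mean value of each vertical band that is band_width columns wide."""
    height, width = len(image), len(image[0])
    means = []
    for x0 in range(0, width, band_width):
        x1 = min(x0 + band_width, width)  # last band may be narrower
        total = sum(sum(row[x0:x1]) for row in image)
        means.append(total / (height * (x1 - x0)))
    return means


def changed_bands(prev, curr, band_width=2, threshold=0.10):
    """Indices of the vertical bands whose mean moved by more than threshold."""
    before = column_band_means(prev, band_width)
    after = column_band_means(curr, band_width)
    return [i for i, (b, a) in enumerate(zip(before, after))
            if abs(a - b) / max(b, 1.0) > threshold]
```

Someone entering from the left should light up band 0 first, then band 1, and so on as they cross the frame, so the sequence of flagged bands also hints at the direction of travel.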

The last consideration would be the rate of change. If a light is flipped on in the room with the camera, all pixels will jump upward by a considerable amount, as will the average (any average). However, if you are comparing images every half hour, then sunrise would have the same impact. So, do you compare by the second, by the minute, or over something longer? The answer depends on what you are expecting to capture.
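To play with the comparison interval, you could log the average brightness with a timestamp and compare each sample against one taken a fixed time earlier. This is a sketch only: flag_changes is a hypothetical helper, samples are assumed to be (seconds, mean brightness) pairs sorted by time, and the interval must be greater than zero.

```python
def flag_changes(samples, interval, threshold=0.10):
    """Times at which the mean differs by more than threshold from the
    newest sample that is at least `interval` seconds older."""
    flagged = []
    ref = 0  # index of the current reference sample
    for i, (t, value) in enumerate(samples):
        # Move the reference forward to the newest sample that is old enough.
        while ref + 1 < i and samples[ref + 1][0] <= t - interval:
            ref += 1
        ref_t, ref_value = samples[ref]
        if t - ref_t < interval:
            continue  # not enough history yet
        if abs(value - ref_value) / max(ref_value, 1.0) > threshold:
            flagged.append(t)
    return flagged


if __name__ == "__main__":
    # Light switched on at t=10: brightness steps from 50 to 150.
    switch = [(t, 50 if t < 10 else 150) for t in range(21)]
    # Sunrise: brightness ramps from 50 to 150 over an hour.
    sunrise = [(t, 50 + 100 * t / 3600) for t in range(3601)]
    print(flag_changes(switch, interval=1))
    print(len(flag_changes(sunrise, interval=1)))
    print(len(flag_changes(sunrise, interval=1800)))
```

With a one-second interval the light switch is caught and the slow sunrise ramp is not. With a half-hour interval the sunrise gets flagged too. Pick the interval to match how fast the thing you care about actually happens.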