
Learning more about how the A.I. works

Greg Kolodziejzyk

As you know, we use a large language model A.I. function to compare your RV transcript to both photos in each trial. The A.I. function is NOT perfect, but in my opinion it's better overall than a human judge. Humans don't think statistically: when we see a match between the word "sea" and some water in a photo, we immediately assign it a high confidence score, when in fact a lot of photos contain water, so statistically speaking the match does not merit a very high z-score.
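
To make that statistical point concrete, here is a minimal sketch of a z-score style confidence measure. The similarity numbers and the helper function are made up purely for illustration; I'm not claiming this is the site's actual formula. The idea is simply that a word only earns a high score when it matches the target photo much better than it matches photos in general.

```python
import numpy as np

def z_score(target_sim: float, pool_sims: np.ndarray) -> float:
    """How far a transcript's similarity to the target photo sits above
    the same transcript's similarity to a pool of unrelated photos."""
    return (target_sim - pool_sims.mean()) / pool_sims.std(ddof=1)

# Hypothetical cosine similarities for the word "sea": it matches the
# target photo fairly well (0.62), but water is common, so it also
# matches plenty of unrelated photos -- and the z-score ends up ordinary.
pool = np.array([0.55, 0.70, 0.48, 0.66, 0.59, 0.73, 0.51, 0.64])
print(round(z_score(0.62, pool), 2))  # ~0.14: statistically unremarkable
```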


The following is an example where the A.I. did a relatively poor job of scoring an RV transcript. I substituted variations of the transcript to see how the A.I. would react, and its scoring is very telling about how it works. If you generalize too much in your RV words, you will score low, but if you are too specific and get the specifics wrong, you will be penalized. If you are specific and correct, you will score very high. If you are general and your generalizations are correct, you will score OK, though the score will seem low to our "meat computer" brains. And if you bundle a specific match with a bunch of words that don't really match anything in the image, your score will generally be lower.


The RV transcript was "bird. lifeform. animal. blue. sea" and the resulting confidence score was a fairly low .24.

Following is how the A.I. scored variations of the RV transcript (a sketch of how you could run this kind of probe yourself appears after the list).

  1. bird. lifeform. animal. blue. sea = .24 (current)

  2. bird = 1.3 (ok, so it seems like the other words aside from "bird" just lowered the score)

  3. bird. animal. blue. sea = .7 (removed lifeform and the score increased)

  4. sea = -.5 (could be a lake, and there are lots of bodies of water in other pics, so that might be about right)

  5. duck = 1.3 (so it knows there are birds in the image)

  6. ducks = 3 (wow, so statistically there is a big difference between one bird and more than one)

  7. birds = .5 (ok, so it knows the birds are ducks)

  8. animal = .3 (too general. Are birds animals though?)

  9. lifeform = -2.3 (I'm not a fan of using traditional RV descriptors like "lifeform" because most people don't use that term, and if the A.I. is trained on regular people's writing, I'm not surprised it got a bad score.)
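
If you want to try this kind of probing yourself, here is a rough sketch using an off-the-shelf CLIP-style model from the sentence-transformers library. The model choice, the file names, the decoy folder, and the z-score formula are all my stand-ins; the site's actual model and scoring pipeline may well differ.

```python
from pathlib import Path

import numpy as np
from PIL import Image
from sentence_transformers import SentenceTransformer, util

# Off-the-shelf image/text embedding model (a stand-in, not the site's model).
model = SentenceTransformer("clip-ViT-B-32")

target_emb = model.encode(Image.open("target.jpg"))  # the trial photo
decoy_embs = [model.encode(Image.open(p))            # pool of unrelated photos
              for p in sorted(Path("decoys").glob("*.jpg"))]

variants = [
    "bird. lifeform. animal. blue. sea",
    "bird",
    "bird. animal. blue. sea",
    "sea",
    "duck",
    "ducks",
    "birds",
    "animal",
    "lifeform",
]

for transcript in variants:
    txt_emb = model.encode(transcript)
    target_sim = float(util.cos_sim(txt_emb, target_emb))
    pool = np.array([float(util.cos_sim(txt_emb, d)) for d in decoy_embs])
    z = (target_sim - pool.mean()) / pool.std(ddof=1)
    print(f"{transcript!r}: z = {z:+.2f}")
```

With a reasonably large decoy pool, you may see a similar broad pattern to the list above: specific-and-correct words stand out, while generic words like "lifeform" sink toward or below zero.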

