ZILtoid1991@lemmy.world to 196@lemmy.blahaj.zone · English · 10 days ago
Rule
image · lemmy.world · 12 comments
PhobosAnomaly@feddit.uk · English · 10 days ago · 21 upvotes
I’m trying to figure out why it has output incremental numbers. It seems like an oddly specific pattern to push out.
Wirlocke@lemmy.blahaj.zone · English · 6 days ago · 1 upvote
LLMs don’t see numbers as numbers; they see them as tokens, which are like words or pieces of words. So “123456789” is like a single word to the LLM because it’s a common enough string of characters. This is also why they struggle with math.
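A rough sketch of the token idea above, in Python. It is a toy greedy longest-match tokenizer with a made-up vocabulary (`TOY_VOCAB` and `toy_tokenize` are hypothetical names, not from any real library). Real LLM tokenizers learn their vocabulary from data, and whether “123456789” actually ends up as one token depends on the tokenizer; the sketch simply assumes it does.

```python
# Toy greedy longest-match tokenizer. The vocabulary is invented for
# illustration; real tokenizers (e.g. BPE-based ones) learn theirs from data.
TOY_VOCAB = {"123456789", "123", "12",
             "0", "1", "2", "3", "4", "5", "6", "7", "8", "9"}

def toy_tokenize(text, vocab=TOY_VOCAB):
    tokens = []
    i = 0
    while i < len(text):
        # Try the longest candidate substring first, then shorter ones.
        for j in range(len(text), i, -1):
            if text[i:j] in vocab:
                tokens.append(text[i:j])
                i = j
                break
        else:
            # Character not in the vocabulary: emit it as its own token.
            tokens.append(text[i])
            i += 1
    return tokens

print(toy_tokenize("123456789"))  # ['123456789']
print(toy_tokenize("1234"))       # ['123', '4']
print(toy_tokenize("7412"))       # ['7', '4', '12']
```

Under this assumed vocabulary the familiar run is one chunk, while less common digit strings get split at arbitrary places, so the model never sees a consistent digit-by-digit view of a number.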
itslilith@lemmy.blahaj.zone · English · 10 days ago · 29 upvotes
That’s probably the most common way numbers are arranged in the training data.
getFrog@piefed.social · English · 10 days ago · 9 upvotes
Why tf is it training on the switch/case statement of my calculator program? Friggin plagiarism man
PhobosAnomaly@feddit.uk · English · 10 days ago · 8 upvotes
Ah fair enough. Makes sense that it’s something straightforward. Cheers.
Rugnjr@lemmy.blahaj.zone · English · edited 8 days ago · 1 upvote
Pretty sure it’s been edited.