• ChaoticNeutralCzech@feddit.org
      2 years ago

      A bot strips away all spaces and letters that aren’t A, T, C or G, then treats the rest like a genetic sequence and checks it against some database.

      Presumably, it runs through many terabytes of data for each comment, as the Gallinula chloropus alone has about 51 billion base pairs, or some 15 GiB. The Genome Ark DB, which has sequences of two common moorhens, contains over 1 PiB. I wonder if a bored sequencing lab employee just wrote it to give their database and computing servers something to do when there is no task running.

      No, I won’t download the genome and check how close the “closest match” is, but statistically, 93 base pairs are expected to recur every $2^{186}$ bits, or once per $10^{40}$ PiB. By evaluating the inequality $(4-1)^m \times \binom{93}{m} \ge 4^{93} \div (\text{pebi} \times 8)$, one can expect the 93-base sequence to appear at least once in a 1 PiB database if $m \ge 32$ mismatches, or over ⅓, are allowed. Not great.

      This assumes true randomness, which is not true of naturally occurring DNA nor of letters in English text, but should be in the right ballpark. Maybe fewer if you account for insertions/deletions.
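
      The estimate above can be checked with a few lines of Python. This is only a sketch of the same back-of-the-envelope arithmetic: it assumes uniformly random, independent bases, counts sequences with exactly m mismatches as the inequality does, and says nothing about how the bot or any real alignment tool scores matches. The names `SEQ_LEN`, `DB_BITS` and `min_mismatches` are made up for illustration.

```python
from math import comb

SEQ_LEN = 93          # bases in the sequence extracted from the comment
DB_BITS = 2**50 * 8   # 1 PiB expressed in bits, as in the estimate


def min_mismatches(seq_len: int, db_bits: int) -> int:
    """Smallest m with 3^m * C(seq_len, m) >= 4^seq_len / db_bits."""
    total = 4 ** seq_len  # number of distinct sequences of this length
    for m in range(seq_len + 1):
        # Sequences that differ from the query in exactly m positions:
        # choose the m positions, then one of 3 other bases at each.
        neighbours = 3 ** m * comb(seq_len, m)
        if neighbours * db_bits >= total:
            return m
    raise ValueError("no m satisfies the inequality")


# PiB of random data per expected exact recurrence of a 93-base sequence
recurrence_pib = 4 ** SEQ_LEN / DB_BITS
print(f"{recurrence_pib:.2e}")            # roughly 1e40
print(min_mismatches(SEQ_LEN, DB_BITS))   # 32
```

      Integer arithmetic is used for the comparison so that nothing is lost to floating-point rounding at these magnitudes.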

  • atomicorange@lemmy.world
    2 years ago

    Implied fact: a baby is capable of having a religion, despite its inability to comprehend the concept.

  • irmoz@lemmy.world
    2 years ago

    7th implied fact: the baby’s religion somehow plays a role in your deciding whether or not to hit it with a bat.

    • T156@lemmy.world
      2 years ago

      Eighth implied fact: The baby is durable enough to be hit by a baseball bat hard enough to fling it out of the stadium, and remain in one piece.