

← 2021-06-11 | 2021-06-13 →
19:14 thimbronion One way to tackle all of the junk data in the encyclopedia entries is to run through each entry and check each word against a dictionary and a name dictionary to generate a list of OCR junk associated with each entry.
19:18 whaack seems like that'll help a lot but it'll leave the most confusing OCR mistakes for readers (ones that accidentally map to another word)
19:19 thimbronion True. Idk how to handle that case.
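A minimal sketch of the dictionary check thimbronion describes, assuming the entries are already loaded as a mapping from entry title to text and that both word lists fit in memory. The function name `find_ocr_junk` and its parameters are hypothetical and do not come from the log.

```python
import re

def find_ocr_junk(entries, dictionary, names):
    """Map each entry title to the tokens found in neither word list.

    entries, dictionary and names are assumed inputs: a dict of
    title -> text, and two iterables of known words and known names.
    """
    # Case-insensitive lookup set built from both word lists.
    known = {w.lower() for w in dictionary} | {n.lower() for n in names}
    junk = {}
    for title, text in entries.items():
        # Runs of letters, optionally with one internal apostrophe.
        tokens = re.findall(r"[^\W\d_]+(?:'[^\W\d_]+)?", text)
        unknown = sorted({t for t in tokens if t.lower() not in known})
        if unknown:
            junk[title] = unknown
    return junk
```

As whaack points out, this only catches tokens that are not words at all. An error such as "modem" for "modern" passes the check because both are in the dictionary.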
~ 32 minutes ~
19:51 whaack thimbronion: All I can think of is running two OCRs and then flagging all the mismatches.
19:52 thimbronion whaack: good idea.
19:52 whaack but I imagine that the OCR algorithm itself probably has this type of check built in, so you'd probably need a really different OCR
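A rough sketch of whaack's two-OCR idea, assuming the same page has already been run through two different OCR engines and the outputs are available as plain strings. It uses `difflib.SequenceMatcher` from the Python standard library to align the two word sequences. The function name `flag_mismatches` is hypothetical.

```python
import difflib

def flag_mismatches(text_a, text_b):
    """Return (word_index_in_a, words_from_a, words_from_b) for each span
    where the two OCR outputs disagree."""
    words_a = text_a.split()
    words_b = text_b.split()
    # autojunk=False turns off the heuristic that ignores very common
    # items in long sequences, so frequent words still take part in alignment.
    matcher = difflib.SequenceMatcher(a=words_a, b=words_b, autojunk=False)
    flagged = []
    for tag, i1, i2, j1, j2 in matcher.get_opcodes():
        if tag != "equal":
            flagged.append((i1, words_a[i1:i2], words_b[j1:j2]))
    return flagged
```

This only helps where the two engines make different mistakes. If both misread the same word the same way, nothing is flagged, which is why a really different second OCR matters.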
19:54 thimbronion If only I had like 20 slavegirls.
19:56 whaack lulz
19:56 thimbronion I could just reward them for finding errors with whippings.