Created attachment 1993 [details] screenshot See the attached screenshot. In some messages, some characters are displayed as underscores. I have noticed this a lot in the feed https://bivol.bg/feed. Other RSS readers show the same articles without issues. Opening the message source in KWrite also shows readable characters, so the problem is in the way Claws Mail renders the text. Note: this is not limited to Cyrillic. I have seen it (though rarely) in English articles too.
*** Bug 4229 has been marked as a duplicate of this bug. ***
This is caused by our html-to-text parser not being aware of multi-byte characters. It parses messages piece by piece, and sometimes a multi-byte character ends up split between two separate pieces. That causes the rest of the message to be displayed as underscores. This is easy to observe by increasing or decreasing the value of SC_HTMLBUFSIZE in html.c by just one: messages which were displayed incorrectly before will either be displayed correctly, or the underscores will start at a different position in their text. Unfortunately, the parser works in a very convoluted way, and I can't find a way to fix this easily without completely rewriting it. Hopefully someone else can.
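For illustration only, here is a minimal sketch of one way a chunked reader could avoid cutting a character in half: before handing a piece to the converter, check whether it ends in an incomplete UTF-8 sequence and carry those bytes over to the next piece. The helper name utf8_incomplete_tail() is hypothetical and is not taken from html.c or from any attached patch; it also assumes the input is UTF-8.

```c
#include <stddef.h>

/* Hypothetical helper, not part of Claws Mail: returns how many bytes at the
 * end of buf[0..len) belong to a UTF-8 sequence that is still incomplete.
 * A caller could hold those bytes back and prepend them to the next piece. */
static size_t utf8_incomplete_tail(const unsigned char *buf, size_t len)
{
    size_t back = 0;

    /* Step back over up to three trailing continuation bytes (10xxxxxx). */
    while (back < len && back < 3 && (buf[len - 1 - back] & 0xC0) == 0x80)
        back++;

    if (back == len)
        return 0; /* empty, or only continuation bytes: no lead byte to judge by */

    unsigned char lead = buf[len - 1 - back];
    size_t need;

    if ((lead & 0x80) == 0x00)
        need = 1; /* ASCII */
    else if ((lead & 0xE0) == 0xC0)
        need = 2;
    else if ((lead & 0xF0) == 0xE0)
        need = 3;
    else if ((lead & 0xF8) == 0xF0)
        need = 4;
    else
        return 0; /* not a valid lead byte; leave it to the decoder */

    /* The last sequence has back + 1 bytes so far. */
    return (back + 1 < need) ? back + 1 : 0;
}
```

With a check like this, a piece ending in the first byte of a two-byte Cyrillic letter would report 1, and the reader could keep that byte for the next piece instead of converting it on its own.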
I hope so too. I also hope the priority of this bug can be raised, as it is a serious one that makes messages unreadable.
Some details:
- Cyrillic is unreadable both in message headers (sometimes) and in the message body (sometimes, unrelated to the headers); the headers are not HTML, but quoted-printable strings;
- Characters are unreadable even with HTML viewing turned off in the config.
I think these may be two different problems.
*** Bug 4292 has been marked as a duplicate of this bug. ***
Some more info.

STR:
1. Pick an RSS message which shows underscores
2. Open its source in a text editor
3. Add a single character anywhere in the body, save
4. In Claws, click another message, then go back to the problematic one
Result: No underscores!
5. Undo the editing, save.
6. Repeat 4.
Result: No underscores.

The problem with this workaround, though, is that on the next refresh of the feed the message may be replaced with its original. I have been looking at the source code of RSSyl but couldn't find anything. It is too complicated for me. I hope an expert can have a look.
*When I say "no underscores" I mean text becomes readable.
Look, all messages shorter than SC_HTMLBUFSIZE (8192 bytes) are displayed correctly. I think sc_html_read_line() has a bug.
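A rough way to check this observation against a given message is sketched below. It assumes the text is valid UTF-8 and is consumed in fixed-size pieces of chunk bytes, which may not match exactly how sc_html_read_line() reads its input. The function name first_split_boundary() is made up for this example.

```c
#include <stddef.h>

/* Hypothetical diagnostic, not part of Claws Mail: returns the offset of the
 * first chunk boundary that falls inside a multi-byte UTF-8 character, or 0
 * if no boundary does. Assumes text is valid UTF-8. */
static size_t first_split_boundary(const unsigned char *text, size_t len,
                                   size_t chunk)
{
    if (chunk == 0)
        return 0;

    for (size_t pos = chunk; pos < len; pos += chunk) {
        /* A continuation byte (10xxxxxx) right after the boundary means the
         * character started in the previous chunk. */
        if ((text[pos] & 0xC0) == 0x80)
            return pos;
    }
    return 0;
}
```

A message no longer than one chunk never reaches a boundary, which would fit the report that short messages are fine. Inserting a single byte before the boundary shifts every later character by one, which would fit the workaround of adding a character to the message source.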
Created attachment 2255 [details] Patch to use ringbuffer and fix bug
The effect appears on plain-text UTF-8 messages larger than 8K. For me these are often RSS messages.
Created attachment 2527 [details] Patch for ringbuffer for parser Modern patch