Tolerate cut-off UTF-8 messages #2

Closed
opened 2020-10-10 20:37:04 +02:00 by p · 0 comments
Owner

`irc_to_utf8()` transcodes from cp1252 if `utf8_validate()` fails. However, servers very often cut messages off in the middle of a multi-byte character, e.g. in Russian text. The whole message then fails validation, gets transcoded as cp1252, and becomes illegible. As a special case, incomplete trailing characters should instead be replaced with the Unicode replacement character (U+FFFD).

I'm not entirely sure whether ISO/Windows encodings are used at all anymore, so a universal replacement character pass might be more appropriate.
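
The trailing-character special case could look roughly like the sketch below. The names `utf8_truncated_tail()` and `utf8_patch_tail()` are hypothetical and not part of the existing code. The sketch only inspects the last few bytes, so it assumes the patched string still goes through `utf8_validate()` afterwards to catch errors elsewhere in the message.

```c
#include <stddef.h>
#include <string.h>

// Hypothetical helper: returns how many trailing bytes of s[0..len) form
// an incomplete UTF-8 sequence (a lead byte followed by fewer continuation
// bytes than it announces), or 0 if the tail does not look truncated.
static size_t
utf8_truncated_tail(const char *s, size_t len)
{
	size_t i = len, seen = 0;

	// Step back over at most three continuation bytes (10xxxxxx).
	while (i > 0 && seen < 3 && ((unsigned char) s[i - 1] & 0xC0) == 0x80) {
		i--;
		seen++;
	}
	if (i == 0)
		return 0;

	// The byte before them must be a lead byte of a multi-byte sequence.
	unsigned char lead = (unsigned char) s[i - 1];
	size_t need;
	if ((lead & 0xE0) == 0xC0)
		need = 2;
	else if ((lead & 0xF0) == 0xE0)
		need = 3;
	else if ((lead & 0xF8) == 0xF0)
		need = 4;
	else
		return 0;  // ASCII or a stray byte: not a truncation

	size_t have = seen + 1;
	return have < need ? have : 0;
}

// Hypothetical helper: copies s into out, replacing a truncated trailing
// sequence with U+FFFD (EF BF BD), and NUL-terminates the result.
// out must have room for len + 3 bytes.  Returns the new length.
static size_t
utf8_patch_tail(const char *s, size_t len, char *out)
{
	size_t keep = len - utf8_truncated_tail(s, len);
	memcpy(out, s, keep);
	if (keep != len) {
		memcpy(out + keep, "\xEF\xBF\xBD", 3);
		keep += 3;
	}
	out[keep] = '\0';
	return keep;
}
```

With something like this in place, `irc_to_utf8()` could patch the tail first and fall back to cp1252 only if the patched string still fails `utf8_validate()`.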

p added this to the v1.0.0 milestone 2020-10-10 20:37:04 +02:00
p self-assigned this 2020-10-10 20:37:04 +02:00
p closed this issue 2020-10-12 23:47:09 +02:00
Reference: p/xK#2