Tolerate cut-off UTF-8 messages #2

Closed
opened 2020-10-10 20:37:04 +02:00 by p · 0 comments
Owner

`irc_to_utf8()` transcodes from cp1252 if `utf8_validate()` fails. However, servers very often cut messages off in the middle of a multi-byte character (Russian text is a common example), and transcoding the whole message as cp1252 then makes it illegible. As a special case, an incomplete trailing character should instead be replaced with the Unicode replacement character (U+FFFD).

I'm not entirely sure whether ISO/Windows encodings are used at all anymore, so a universal replacement character pass might be more appropriate.
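
A minimal sketch of the special case, for illustration only: it detects a multi-byte sequence that was cut off at the very end of the buffer and swaps it for U+FFFD. The names `utf8_truncated_tail()` and `utf8_replace_truncated_tail()` are hypothetical and not part of the existing code. It assumes the result would still go through `utf8_validate()` afterwards, since the bytes before the tail are not checked here.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: length in bytes of an incomplete UTF-8 sequence
 * at the very end of s, or 0 if the buffer does not end in one.
 * Only the tail is inspected; the rest of the buffer is not validated. */
static size_t
utf8_truncated_tail (const char *s, size_t len)
{
	assert (s != NULL);
	/* The lead byte of a cut-off sequence is at most 3 bytes from the end. */
	for (size_t back = 1; back <= 3 && back <= len; back++)
	{
		unsigned char c = (unsigned char) s[len - back];
		if ((c & 0xC0) == 0x80)
			continue;  /* continuation byte, keep looking for the lead */

		size_t need = 0;
		if ((c & 0xE0) == 0xC0)
			need = 2;
		else if ((c & 0xF0) == 0xE0)
			need = 3;
		else if ((c & 0xF8) == 0xF0)
			need = 4;
		/* ASCII or an invalid lead byte leaves need at 0: nothing to report. */
		return need > back ? back : 0;
	}
	return 0;
}

/* Hypothetical helper: returns a newly allocated, NUL-terminated copy of s
 * with any cut-off trailing sequence replaced by U+FFFD (EF BF BD),
 * or NULL on allocation failure.  The caller frees the result. */
static char *
utf8_replace_truncated_tail (const char *s, size_t len)
{
	static const char replacement[] = "\xEF\xBF\xBD";
	size_t tail = utf8_truncated_tail (s, len);
	size_t keep = len - tail;
	size_t extra = tail ? sizeof replacement - 1 : 0;

	char *out = malloc (keep + extra + 1);
	if (!out)
		return NULL;
	memcpy (out, s, keep);
	memcpy (out + keep, replacement, extra);
	out[keep + extra] = '\0';
	return out;
}
```

Running this before the cp1252 fallback would keep an otherwise valid message intact. A universal replacement pass would need a full decoder rather than this tail check.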

p added this to the v1.0.0 milestone 2020-10-10 20:37:04 +02:00
p self-assigned this 2020-10-10 20:37:04 +02:00
p closed this issue 2020-10-12 23:47:09 +02:00
Reference: p/xK#2