[personal profile] pphaneuf
Oh, my goodness. Having fought in the FidoNet charset wars (I was part of the NETDEV echo, way back then), I thought Unicode was supposed to be my saviour, or something.

Behold: I am trying to keep a two-way rsync of my music library between my Mac OS X laptop and my Linux workstation. Besides the obvious duplication induced by such genius as the interaction between the case-preserving but case-insensitive filesystem of Mac OS X and the case-sensitive filesystem of Linux (yeah, "U2" and "u2" are two totally different bands, didn't you know?), charsets come to bite my arse once more, as if I hadn't done my share already.

Some bands, albums or songs with accented characters in their names were in the ISO-8859-1 charset in one place and in UTF-8 in another. At this point, I was all happy about Linux distributions finally having switched over to UTF-8, and Mac OS X being UTF-8 as well, thinking those were old leftovers (they were) and that I just needed to rename them to UTF-8 in order to regain my sanity.

No. Of course not. How dumb was I?

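The Latin accented characters can be represented in two ways in Unicode (and hence in UTF-8): as a single precomposed code point (the same number the character has in ISO-8859-1), or as the plain letter followed by a combining accent, some sort of "dead character". Both render identically but are different byte sequences, which means that strcmp will mark two identical-looking strings as being different.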
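To make that concrete, here is a minimal sketch in Python (not what either filesystem actually runs, just an illustration). It builds "é" both ways, shows that a plain comparison sees two different strings, and that normalizing both sides to one form first makes them match. The helper name same_text is made up for this example.

```python
import unicodedata

# "é" as a single precomposed code point (U+00E9, same number as in ISO-8859-1)
precomposed = "\u00e9"
# "é" as a plain "e" followed by U+0301 COMBINING ACUTE ACCENT
decomposed = "e\u0301"


def same_text(a, b):
    """Hypothetical helper: compare two strings after normalizing both to NFC."""
    return unicodedata.normalize("NFC", a) == unicodedata.normalize("NFC", b)


# A strcmp-style comparison sees two different strings...
print(precomposed == decomposed)
# ...because the UTF-8 bytes really are different.
print(precomposed.encode("utf-8"), decomposed.encode("utf-8"))
# Normalizing both sides first makes them compare equal.
print(same_text(precomposed, decomposed))
```

Normalizing to NFD instead would work just as well for the comparison; what matters is that both sides go through the same form.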

Now, what would be your guess on Mac OS X and Linux using the same method to represent Latin accented characters? Or, say, the chances of either of them using something more sophisticated than strcmp to compare strings (not that I blame them, this sounds like a ridiculously complicated problem, of the kind we were trying to get rid of by kissing the charsets goodbye)?
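For what it's worth, spotting the duplicates before a sync need not be that complicated. The following is a rough sketch, assuming that normalizing to NFC and case-folding is a good enough comparison key for a music library (it will not cover every corner of Unicode). The function names and the sample names other than "U2" are made up for illustration.

```python
import unicodedata
from collections import defaultdict


def comparison_key(name):
    """Hypothetical key: same value for names differing only in case or normalization."""
    return unicodedata.normalize("NFC", name).casefold()


def find_collisions(names):
    """Return groups of names that look the same but are stored differently."""
    groups = defaultdict(list)
    for name in names:
        groups[comparison_key(name)].append(name)
    return [group for group in groups.values() if len(group) > 1]


# Illustrative directory names: two spellings of "U2", two encodings of "Café".
names = ["U2", "u2", "Caf\u00e9", "Cafe\u0301", "Other"]
for group in find_collisions(names):
    print([n.encode("utf-8") for n in group])
```

Feeding it the directory listings from both machines would give the list of entries that need merging by hand.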

*sobs*

P.S.: Thank Bob for Firewire and the bandwidth of a hard disk in an enclosure.

Date: 2007-01-22 04:17 pm (UTC)
From: [identity profile] azrhey.livejournal.com
So I was about to go hunt for pictures of nekkid women to comment about in this technopost.

But then I realised I actually understood what you were talking about.

Date: 2007-01-22 04:39 pm (UTC)
From: [identity profile] pphaneuf.livejournal.com
Stupid charsets.
