Re: Report Site Issues Here

Posted: 28 February 19 11:49 am
by Laighside Legends
Caches with names containing non-ASCII characters still seem to be causing us problems. For example, see https://geocaching.com.au/cache/gc7xan2

The name starts with a non-ASCII character, but for some reason the rest of the name is missing too. The individual GPX file from Groundspeak shows the entire name correctly in Unicode. But when I import it to GCA, something goes wrong and the entire name is lost. (GSAK was not involved at all, so that's not the problem.)

I tried it locally on my machine and the importer swapped the non-ASCII character(s) for question marks and then left the rest of the name intact. Not the best solution but it's probably ok. But why doesn't the website do the same?
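
For what it's worth, the question-mark behaviour I saw locally looks like a plain ASCII encode with replacement. A minimal Python sketch, assuming that is roughly what the importer does (the cache name below is made up and the real importer may well work differently):

```python
# Hypothetical cache name; the real name is not reproduced here.
name = "\U0001F984 Example Cache"

# Encode to ASCII, replacing every character that cannot be
# represented with "?", then decode back to a normal string.
ascii_name = name.encode("ascii", errors="replace").decode("ascii")

print(ascii_name)  # ? Example Cache
```

The rest of the name survives because only the characters that fail to encode are replaced.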

Re: Report Site Issues Here

Posted: 28 February 19 12:52 pm
by caughtatwork
It's not a UTF-8 character so:
a. The GPX file should not contain it as the character encoding is UTF-8
b. Our DB is UTF-8 encoded so should not store it.
Anything after that is up for grabs.
Please email Groundspeak and advise them to stop using non-UTF-8 characters in their UTF-8 encoded file.

Re: Report Site Issues Here

Posted: 28 February 19 3:27 pm
by Laighside Legends
Are you sure? The first character in the example above appears to be a valid UTF-8 character to me.

The 4 bytes in the GPX file are:
f0 9f a6 84

Which is exactly how U+1F984 is supposed to be encoded (according to https://en.wikipedia.org/wiki/UTF-8)

Can you explain which part of that you think doesn't fit the UTF-8 specs?
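
In case anyone wants to check the bytes themselves, here is a small Python sketch of the 4-byte UTF-8 layout. The helper name `encode_4byte` is just something I made up for illustration; the comparison at the end uses Python's built-in encoder.

```python
def encode_4byte(cp: int) -> bytes:
    """Encode a code point in U+10000..U+10FFFF as 4-byte UTF-8.

    Layout: 11110xxx 10xxxxxx 10xxxxxx 10xxxxxx
    (illustrative helper, not taken from any importer code)
    """
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is outside the 4-byte range")
    return bytes([
        0xF0 | (cp >> 18),            # top 3 bits
        0x80 | ((cp >> 12) & 0x3F),   # next 6 bits
        0x80 | ((cp >> 6) & 0x3F),    # next 6 bits
        0x80 | (cp & 0x3F),           # lowest 6 bits
    ])


manual = encode_4byte(0x1F984)
builtin = "\U0001F984".encode("utf-8")

print(manual.hex(" "))    # f0 9f a6 84
print(manual == builtin)  # True
```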

Re: Report Site Issues Here

Posted: 28 February 19 3:43 pm
by caughtatwork
utf8 is limited to the 1- to 3-byte utf8 codes. This leaves out Emoji.

Re: Report Site Issues Here

Posted: 28 February 19 3:53 pm
by Laighside Legends
Are we using an old version of UTF-8 or something?

Every reference I look at says UTF-8 can go up to 4 bytes per character. Which suggests that the GPX file meets the standard.

Re: Report Site Issues Here

Posted: 28 February 19 3:56 pm
by CraigRat
Our instance of MySQL/MariaDB uses utf8. There is a utf8mb4 format that extends it, but I don't know what the ramifications are if we change our encoding.

Re: Report Site Issues Here

Posted: 28 February 19 4:12 pm
by Laighside Legends
CraigRat wrote:Our instance of MySQL/MariaDB uses utf8. There is a utf8mb4 format that extends it, but I don't know what the ramifications are if we change our encoding.
Ah, that makes sense. Looks like MySQL only started supporting 4 byte UTF-8 in 2010. I guess we should probably think about upgrading at some point...

Or at the very least, filter out any 4 byte characters before they get to the database.
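
Something like the following would do the filtering. This is only a sketch in Python, assuming the goal is to drop every character that needs 4 bytes in UTF-8 (anything above U+FFFF) and keep everything else; `strip_4byte_chars` is a made-up name and I have not seen how the actual importer handles this.

```python
import re

# Characters above U+FFFF are the ones that need 4 bytes in UTF-8,
# which a 3-byte "utf8" column cannot store.
_FOUR_BYTE = re.compile("[\U00010000-\U0010FFFF]")


def strip_4byte_chars(text: str) -> str:
    """Remove 4-byte characters, then trim leftover outer whitespace."""
    return _FOUR_BYTE.sub("", text).strip()


# Hypothetical cache names for illustration only.
print(strip_4byte_chars("\U0001F984 Example Cache"))  # Example Cache
print(strip_4byte_chars("Caf\u00e9 \u2713"))          # unchanged: 2- and 3-byte characters are kept
```

The point is that only the offending characters go, so a name that starts with an emoji still keeps the rest of its text.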

Re: Report Site Issues Here

Posted: 28 February 19 5:55 pm
by caughtatwork
Laighside Legends wrote:
CraigRat wrote:Our instance of MySQL/MariaDB uses utf8. There is a utf8mb4 format that extends it, but I don't know what the ramifications are if we change our encoding.
Ah, that makes sense. Looks like MySQL only started supporting 4 byte UTF-8 in 2010. I guess we should probably think about upgrading at some point...

Or at the very least, filter out any 4 byte characters before they get to the database.
As a senator, please read the Senate thread titled: Remove 4 byte UTF-8 emoj and see what the senate decided to do.

Re: Report Site Issues Here

Posted: 28 February 19 11:00 pm
by Laighside Legends
It's a bit all over the place but this seemed to be the conclusion in the thread from 2016:
Long term:
I think we should update to UTF8MB4 but that requires the new server and some time.

Short term:
Strip the emoji on input and avoid the garbage from GC.
Neither of these things seems to have actually happened though. Instead of stripping out the 4 byte characters, the entire name gets removed.

Re: Report Site Issues Here

Posted: 01 March 19 6:27 am
by caughtatwork
The emoji gets stripped, as discussed, as agreed.

Re: Report Site Issues Here

Posted: 01 March 19 10:02 am
by Laighside Legends
Ok then, if you insist, let's have caches without names. It's not useful to anyone but at least it meets some ill-defined specs from years ago.

Re: Report Site Issues Here

Posted: 01 March 19 2:12 pm
by caughtatwork
Laighside Legends wrote:Ok then, if you insist, let's have caches without names. It's not useful to anyone but at least it meets some ill-defined specs from years ago.

Why are you being so antagonistic?
When we get a GPX file with an emoji we strip the emoji from the name.
I don't know what happened when you tried to load the GPX file but when I load it the emoji is stripped out and the cache name minus the emoji is used.
Please try a different cache with an emoji and report back as to whether what I am saying is right or a bug.
If there is something wrong please let me know the cache you're loading.

Re: Report Site Issues Here

Posted: 01 March 19 5:17 pm
by Laighside Legends
This appears to be a pretty minor bug but for some reason we have to spend 3 days arguing before anything is done about it.

First you blame Groundspeak with no evidence. Then you insist there is no bug and everything is fine even though several caches clearly don't have names. (I acknowledge the bug could be in something other than Groundspeak or GCA but I can't see how this is possible if I haven't used GSAK or other 3rd party software?)

The cache I mentioned in the first post is just one of an entire powertrail without names. I didn't do the initial load, so someone else could've uploaded a dodgy GPX to start with. I then tried to update it with a GPX file directly from Groundspeak and it didn't fix the problem. I then tried to replicate the problem on my machine but it worked (mostly) as it should - the caches had names anyway. Hence I was a bit confused as to where the problem is...

Re: Report Site Issues Here

Posted: 01 March 19 5:28 pm
by caughtatwork
Nothing has been done. The bug seems to be in the data, not the code. If you load the GPX file you get a name. I don't know who did what, but if it works for you now then great. Maybe if you load a whole PQ you can get the names corrected. That's probably better than getting into an argument about technical items which may behave differently on your own installation and ours. If there are no issues when you load the PQ then we're done.

Re: Report Site Issues Here

Posted: 12 March 19 10:30 pm
by Luckyl10n
Hi Guys

I just published GA13696 and used incorrect posted coords first up. These have been corrected and it shows up in Marrickville dZ, but on the cache page it shows as Strathfield. Can you fix for me please?

Thanx
LL