Description
What version of Go are you using (go version)?
$ go version
go version go1.16.6 linux/amd64
Does this issue reproduce with the latest release?
Yes
What operating system and processor architecture are you using (go env)?
go env Output
$ go env
GO111MODULE=""
GOARCH="amd64"
GOBIN=""
GOCACHE="/home//.cache/go-build"
GOENV="/home//.config/go/env"
GOEXE=""
GOFLAGS=""
GOHOSTARCH="amd64"
GOHOSTOS="linux"
GOINSECURE=""
GOMODCACHE="/home//go/pkg/mod"
GONOPROXY=""
GONOSUMDB=""
GOOS="linux"
GOPATH="/home//go"
GOPRIVATE=""
GOPROXY="https://proxy.golang.org,direct"
GOROOT="/usr/lib/go"
GOSUMDB="sum.golang.org"
GOTMPDIR=""
GOTOOLDIR="/usr/lib/go/pkg/tool/linux_amd64"
GOVCS=""
GOVERSION="go1.16.6"
GCCGO="gccgo"
AR="ar"
CC="gcc"
CXX="g++"
CGO_ENABLED="1"
GOMOD="/dev/null"
CGO_CFLAGS="-g -O2"
CGO_CPPFLAGS=""
CGO_CXXFLAGS="-g -O2"
CGO_FFLAGS="-g -O2"
CGO_LDFLAGS="-g -O2"
PKG_CONFIG="pkg-config"
GOGCCFLAGS="-fPIC -m64 -pthread -fmessage-length=0 -fdebug-prefix-map=/tmp/go-build***=/tmp/go-build -gno-record-gcc-switches"
What did you do?
https://play.golang.org/p/zSGcE1no8yN
package main

import (
	"fmt"

	"golang.org/x/net/idna"
)

func main() {
	eszett, _ := idna.Lookup.ToASCII("ß")
	fmt.Printf("ß => %s", eszett)
}
What did you expect to see?
I expected ß to be kept as ß, as in Firefox and Safari:
ß => ß
I am currently using the idna package to develop DNS-related tools (https://github.com/favonia/cloudflare-ddns), and I would like a predefined profile that uses non-transitional IDNA2008 processing. Although the documentation warns that the Lookup profile may evolve over time, I understand that existing software could rely on its current behavior. Therefore, I propose adding a new profile named IDNA2008Lookup or NontransitionalLookup (or some other suitable name; I am not a native English speaker) as follows:
IDNA2008Lookup = &Profile{options{
	transitional: false,
	useSTD3Rules: true,
	checkHyphens: true,
	checkJoiners: true,
	trie:         trie,
	fromPuny:     validateFromPunycode,
	mapping:      validateAndMap,
	bidirule:     bidirule.ValidString,
}}
Alternatively, the idna package could provide new methods for deriving profiles from existing ones. It would then be trivial to create such a profile from the existing Lookup profile (especially after the bugfix https://go-review.googlesource.com/c/text/+/317729/ is merged). Here is pseudocode demonstrating the idea:
func main() {
	// Derive is the proposed method; it does not exist in the idna package today.
	eszett, _ := idna.Lookup.Derive(idna.Transitional(false)).ToASCII("ß")
	fmt.Printf("ß => %s", eszett)
}
What did you see instead?
ß is mapped to ss, as in Chrome/Chromium:
ß => ss
The Lookup profile currently implements what is called transitional processing in Unicode TS #46. This is the current behavior of Chrome/Chromium, though there is an open issue discussing a possible switch.
Related Issues
This is closely related to #46001, which proposed changing the default behavior of net/http. I like #46001 very much, but chose to propose something more conservative in case there is any objection to it. This proposal merely adds a helpful new definition to facilitate non-transitional processing. Whether #46001 should be accepted, or more generally whether the standard library and other libraries should change their behavior, is outside the scope of this issue, though those changes would probably benefit from this proposal. I strongly believe Go should make it easy to write software that uses non-transitional IDNA2008 processing, even if the standard library sticks to the current behavior.
Further Justification
It is true that one can already define a suitable profile by carefully combining the correct set of options. However, I found that construction very unintuitive and would strongly prefer a predefined profile that is ready to use.
Implementation Details and Auxiliary Changes
Introducing a new profile should be trivial. In fact, the code quoted above is almost the entire change (modulo documentation and testing). Relatedly, the comment "It is used by most browsers when resolving domain names." for Transitional is no longer accurate and, in my opinion, should also be changed.