Document why "charset=UTF-8" is specified for JSON and not for XML [SPR-14715] #19280

Closed
spring-projects-issues opened this issue Sep 14, 2016 · 13 comments
Assignees
Labels
in: web Issues in web modules (web, webmvc, webflux, websocket) type: task A general task
Milestone

Comments

@spring-projects-issues
spring-projects-issues commented Sep 14, 2016

Manuel Jordan opened SPR-14715 and commented

Hello

I just added the jackson-dataformat-xml module to my project, along with the following bean:

@Bean
public MappingJackson2XmlHttpMessageConverter mappingJackson2XmlHttpMessageConverter() {
    MappingJackson2XmlHttpMessageConverter converter = new MappingJackson2XmlHttpMessageConverter();
    converter.setPrettyPrint(true);
    return converter;
}

Now my test classes are failing with the following error message:

java.lang.AssertionError: Content type expected:<application/xml> but was:<application/xml;charset=UTF-8>

The MediaType class already declares the following constants for JSON:

  • APPLICATION_JSON_UTF8
  • APPLICATION_JSON_UTF8_VALUE

Equivalent constants should be considered for XML.

My RestController currently has the following:

	@GetMapping(produces={MediaType.APPLICATION_XML_VALUE, MediaType.APPLICATION_JSON_UTF8_VALUE})
	public @ResponseBody GenericCollection<Persona> findAll(){
		return new GenericCollection<>(personaService.findAll());
	}

	@GetMapping(value="/{id}", produces={MediaType.APPLICATION_XML_VALUE, MediaType.APPLICATION_JSON_UTF8_VALUE})
	public @ResponseBody Persona findOneById(@PathVariable String id){
		return personaService.findOne(id);
	}

Thank you.


Affects: 4.3 GA, 4.3.1, 4.3.2

Issue Links:

Referenced from: commits 27e87e5, 3879179
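
The assertion failure reported above comes from an exact comparison of the two Content-Type values, so the extra charset parameter is enough to break it. As a minimal sketch in plain Java (the class and method names are hypothetical and this is not how Spring compares media types), a test can compare only the type/subtype part and ignore parameters:

```java
import java.util.Locale;

class ContentTypeComparison {

    /** Returns the "type/subtype" part of a Content-Type value, lower-cased, without parameters. */
    static String baseType(String contentType) {
        int semicolon = contentType.indexOf(';');
        String base = semicolon >= 0 ? contentType.substring(0, semicolon) : contentType;
        return base.trim().toLowerCase(Locale.ROOT);
    }

    /** Compares two Content-Type values while ignoring parameters such as charset. */
    static boolean sameBaseType(String expected, String actual) {
        return baseType(expected).equals(baseType(actual));
    }

    public static void main(String[] args) {
        String expected = "application/xml";
        String actual = "application/xml;charset=UTF-8";
        System.out.println(expected.equals(actual));        // false: exact match, as in the failing test
        System.out.println(sameBaseType(expected, actual)); // true: parameters are ignored
    }
}
```

In Spring MVC Test, the content().contentTypeCompatibleWith(...) matcher appears to serve the same purpose, so it may be a better fit than a hand-written helper.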

Manuel Jordan commented

BTW, it fails even when I add:

converter.setSupportedMediaTypes(Arrays.asList(MediaType.APPLICATION_XML));

This is based on the MappingJackson2XmlHttpMessageConverter API documentation, which says:

By default, this converter supports application/xml, text/xml, and application/*+xml with UTF-8 character set. This can be overridden by setting the supportedMediaTypes property.

Theoretically, with converter.setSupportedMediaTypes(Arrays.asList(MediaType.APPLICATION_XML)); the UTF-8 charset should be removed.

Is that the intended behavior?

spring-projects-issues commented Sep 15, 2016

Sébastien Deleuze commented

There are various things to consider here.

As described in #18209, as of Spring Framework 4.3, the produces attribute can be used to specify the supported media types without losing the charset. This is especially useful for JSON because browsers don't use UTF-8 by default if this charset is not specified in the Content-Type (unlike XML, where UTF-8 is used by default if not specified).
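
To illustrate why the charset matters when a client does not default to UTF-8, here is a small sketch in plain Java. It assumes, purely for illustration, that the client falls back to ISO-8859-1; the class and method names are hypothetical.

```java
import java.nio.charset.StandardCharsets;

class CharsetMismatchDemo {

    /** Encodes text as UTF-8, then decodes the bytes as ISO-8859-1, as a client guessing the wrong charset would. */
    static String decodeUtf8BytesAsLatin1(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.ISO_8859_1);
    }

    /** Encodes and decodes with UTF-8 on both sides. */
    static String roundTripUtf8(String text) {
        byte[] utf8Bytes = text.getBytes(StandardCharsets.UTF_8);
        return new String(utf8Bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // The accented character is written as an escape so the source file encoding does not matter.
        String json = "{\"name\":\"\u00e9\"}";
        System.out.println(roundTripUtf8(json).equals(json));           // true
        System.out.println(decodeUtf8BytesAsLatin1(json).equals(json)); // false: the accented character is garbled
    }
}
```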

MediaType.APPLICATION_JSON_UTF8 and MediaType.APPLICATION_JSON_UTF8_VALUE have been introduced as helpers in Spring Framework 4.2 when this improvement was not available yet, and are useful for testing since the charset is explicitly set.

I think introducing jackson-dataformat-xml broke your test because the JAXB XML converter and the Jackson XML converter do not behave the same way. Both use UTF-8 by default, but the JAXB XML converter creates a Content-Type: application/xml header while the Jackson XML converter creates Content-Type: application/xml;charset=UTF-8. The latter could be seen as a side effect of the original JSON support.

Instead of adding MediaType.APPLICATION_XML_UTF8 for testing purposes, maybe we could modify MappingJackson2XmlHttpMessageConverter to not explicitly specify the charset in the Content-Type header when the default one (UTF-8) is used. That would make the JAXB and Jackson XML converters more consistent and would make testing easier, while avoiding the addition of a new charset-specific MediaType variant.

Any thoughts, Rossen Stoyanchev, Juergen Hoeller?
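
The proposal above (leave the charset out of the Content-Type header when it is the default) can be sketched in plain Java as follows. The helper is hypothetical and only illustrates the rule; it is not the code of MappingJackson2XmlHttpMessageConverter.

```java
import java.nio.charset.Charset;
import java.nio.charset.StandardCharsets;

class ContentTypeHeader {

    /**
     * Builds a Content-Type value, leaving out the charset parameter when it
     * equals the given default (hypothetical helper, not Spring's implementation).
     */
    static String format(String baseType, Charset charset, Charset defaultCharset) {
        if (charset == null || charset.equals(defaultCharset)) {
            return baseType;
        }
        return baseType + ";charset=" + charset.name();
    }

    public static void main(String[] args) {
        // Default charset: the parameter is omitted.
        System.out.println(format("application/xml", StandardCharsets.UTF_8, StandardCharsets.UTF_8));
        // Non-default charset: the parameter is kept so clients can decode the body.
        System.out.println(format("application/xml", StandardCharsets.ISO_8859_1, StandardCharsets.UTF_8));
    }
}
```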

spring-projects-issues commented Sep 15, 2016

Manuel Jordan commented

Hello Sébastien

Thanks for the reply.

MediaType.APPLICATION_JSON_UTF8 and MediaType.APPLICATION_JSON_UTF8_VALUE have been introduced as helpers in Spring Framework 4.2 when this improvement was not available yet, and are useful for testing since the charset is explicitly set.

Interesting. Perhaps this is related to another JIRA issue I created some time ago:
SPR-13600 - Add MediaType.APPLICATION_JSON_UTF8 and improve documentation. You were involved there, and your support was valuable.

I agree that jackson-dataformat-xml behaves differently from JAXB. I have two problems.

First, I've removed all the JAXB 2 annotations because jackson-dataformat-xml produces a very different XML structure, and all my test methods for HTTP GET requests fail. To resolve this issue I am now using the Jackson XML annotations. With this migration I have also removed the spring-oxm module, since it is no longer necessary.

Second, the charset issue reported here.

I agree and I understand the importance of UTF-8 because, according to the other JIRA issue, it is mandatory to use UTF-8 for 'weird' characters such as é, ñ, ê. Now the same applies to XML.

Instead of adding MediaType.APPLICATION_XML_UTF8 for testing purpose, maybe we could modify MappingJackson2XmlHttpMessageConverter to not explicitly specify the charset in the Content-Type header when the default one (UTF-8) is used

You are the expert here, but I would respectfully say no, because a REST service has several kinds of clients/consumers:

  • Web Browser (used a lot for any webinar or testing development)
  • MockMvc
  • RestTemplate
  • cURL

I am not sure whether your idea affects Spring HATEOAS too.

Even if converter.setSupportedMediaTypes(Arrays.asList(MediaType.APPLICATION_XML)); worked (which is not the case), I think it is better to add the new constants. I mean, I think it is valuable to have UTF-8 variants for both XML and JSON.

Kind Regards

spring-projects-issues commented Sep 16, 2016

Sébastien Deleuze commented

Indeed, I already linked the related #18178 issue.

My point is that characters like é, ñ, ê require Content-Type: application/json;charset=UTF-8 in JSON to be processed correctly (see this Chrome bug as an example), while they are processed correctly in XML with Content-Type: application/xml, which truly uses UTF-8 as a default. As a consequence, such a change (already the behavior with JAXB) should not produce any side effect for clients/consumers (unless I am missing something).
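
The XML side of this argument can be checked with the JDK's built-in parser. The sketch below (hypothetical class and method names) parses UTF-8 bytes that carry no XML declaration, no byte order mark and no external charset hint, and relies on the parser falling back to UTF-8:

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;

class XmlDefaultEncodingDemo {

    /** Parses raw XML bytes with no charset hint and returns the root element's text. */
    static String parseRootText(byte[] xmlBytes) {
        try {
            Document document = DocumentBuilderFactory.newInstance()
                    .newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xmlBytes));
            return document.getDocumentElement().getTextContent();
        } catch (Exception e) {
            throw new IllegalStateException("XML parsing failed", e);
        }
    }

    public static void main(String[] args) {
        // No XML declaration and no BOM: the parser has to fall back to its default encoding.
        String accented = "\u00e9\u00f1\u00ea";
        byte[] body = ("<name>" + accented + "</name>").getBytes(StandardCharsets.UTF_8);
        System.out.println(parseRootText(body).equals(accented)); // true
    }
}
```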

Any thoughts?

Manuel Jordan commented

Any thoughts about this situation?

It seems to make sense to work on MappingJackson2XmlHttpMessageConverter instead, because this problem only occurs in testing.

To avoid confusion: what I understood is that, in production, application/xml is practically the same as application/xml;charset=UTF-8. Am I correct? I think a special note in the reference documentation and the javadoc would be valuable. The question is where to put it...

Sébastien Deleuze commented

Since the Chrome bug has been fixed in Chrome 62 (we are now on Chrome 65, and it is expected to be an evergreen browser that updates automatically), I am wondering whether we could switch back to MediaType.APPLICATION_JSON by default in Spring Framework 5.1 and deprecate MediaType.APPLICATION_JSON_UTF8_VALUE. As stated in RFC 7159, "No charset parameter is defined for this registration. Adding one really has no effect on compliant recipients". It also makes responses bigger than necessary.
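
For a sense of scale on the response size point, this small sketch (hypothetical class name) counts the extra header bytes that the charset parameter adds to every JSON response:

```java
import java.nio.charset.StandardCharsets;

class CharsetParameterOverhead {

    /** Number of extra header bytes when {@code withCharset} is sent instead of {@code plain}. */
    static int extraBytes(String plain, String withCharset) {
        return withCharset.getBytes(StandardCharsets.US_ASCII).length
                - plain.getBytes(StandardCharsets.US_ASCII).length;
    }

    public static void main(String[] args) {
        System.out.println(extraBytes("application/json", "application/json;charset=UTF-8")); // 14
    }
}
```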

Any thoughts, Rossen Stoyanchev, Brian Clozel?

Brian Clozel commented

Hey Sébastien Deleuze, do we know whether that bug has been fixed in other browsers as well (Edge, Firefox, Safari and all the WebKit-based ones)? I guess Chrome and the others don't share the same browser engine anymore.
If we're confident about that, we could schedule that change for a future release, depending on the browser situation.

Sébastien Deleuze commented

It has been fixed in Firefox for more than a year, but indeed we need to test with Edge and Safari.

Manuel Jordan commented

For Spring: what is the official list of web browsers used to keep track of this kind of bug?

My point is that I am not sure whether this bug is fixed in Vivaldi and SeaMonkey.

Sébastien Deleuze commented

After a test with Safari and this URL, it appears that the bug is still present in at least Safari, so we need to keep the current behavior. I will therefore turn this issue into a documentation one, to give users more insight into why we specify UTF-8 for JSON and not for XML.

Manuel Jordan commented

Could you please share the GitHub pull request?
I want to see the changes.

Thanks

Sébastien Deleuze commented

The related commit is this one. Please note that you can see related commits with the links under the Development section at the right of that page.

Manuel Jordan commented

Thanks for the link.

Yes, I forgot about that, sorry.
