Issue
I have been developing a Java web app that simply takes first_name, middle_name and last_name parameters via an HTML form, embeds that data into an XML document, and sends it back to the client.
I set the Content-Type: text/xml.
Here is my servlet code:
package com.adi.request.xml;
import java.io.*;
import javax.servlet.ServletException;
import javax.servlet.http.HttpServlet;
import javax.servlet.http.HttpServletRequest;
import javax.servlet.http.HttpServletResponse;
public class RequestToXMLServlet extends HttpServlet {
private String lastName;
private String firstName;
private String middleName;
/* Request Handling... */
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response) {
setName(request); // Initialising the firstName, middleName and lastName
String xmlDoc = getXML(); // Build and receive the XML output
response.setContentType("text/xml"); // TO BE NOTED...
try(PrintWriter writer = response.getWriter()) {
writer.print(xmlDoc); // Printing the XML output
writer.flush();
} catch(IOException e) {
e.printStackTrace();
}
}
// Setting the firstName, middleName and lastName
private void setName(HttpServletRequest request) {
firstName = request.getParameter("first_name");
lastName = request.getParameter("last_name");
middleName = request.getParameter("middle_name");
}
// Building the XML output
private String getXML() {
// The append() method just adds a \r\n at the end of every line.
String xmlDoc = append("<?xml version=\"1.0\" encoding=\"utf-8\"?>")+
append("<Request>")+
append(" <FirstName>"+firstName+"</FirstName>")+
append(" <MiddleName>"+middleName+"</MiddleName>")+
append(" <LastName>"+lastName+"</LastName>")+
append("</Request>");
return xmlDoc;
}
private String append(String str) {
return str + "\r\n";
}
}
The HTML form:
<!DOCTYPE html>
<html>
<head>
<title>Request to XML - Servlet</title>
</head>
<body>
<form method="GET" action="Request.do">
<label for="first_name">Firstname:</label>
<input type="text" name="first_name" id="first_name" />
<br>
<label for="middle_name">Middlename</Label>
<input type="text" name="middle_name" id="middle_name" />
<br>
<label for="last_name">Lastname</Label>
<input type="text" name="last_name" id="last_name" />
<br>
<input type="submit" name="submit" value="GET" />
</form>
</body>
</html>
This works fine and my browser properly displays the XML formatted data.
The problem is that I wrote a small Jython app that makes an HTTP POST request using raw sockets to the Java servlet above. Though it receives properly formatted XML data, it also receives unwanted characters at the beginning and end of the actual XML data.
Here is my Jython code:
from java.io import *
from java.net import *
from java.util import *
sock = Socket("localhost", 8080)
ostream = sock.getOutputStream()
writer = PrintWriter(ostream)
params="first_name=Aditya&middle_name=Rameshwarpratap&last_name=Singh"
writer.print("GET /RequestToXML/Request.do?"+params+" HTTP/1.1\r\n")
writer.print("Host: localhost:8080\r\n")
writer.print("Connection: Close\r\n")
writer.print("\r\n")
writer.flush()
istream = sock.getInputStream()
scanner = Scanner(istream)
while(scanner.hasNextLine()):
    print(scanner.nextLine())
istream.close()
ostream.close()
scanner.close()
writer.close()
sock.close()
The output of this code is:
HTTP/1.1 200 OK
Server: Apache-Coyote/1.1
Content-Type: text/xml;charset=ISO-8859-1
Transfer-Encoding: chunked
Date: Thu, 16 Jul 2015 18:46:37 GMT
Connection: close
bc // What is this?
<?xml version="1.0" encoding="utf-8"?>
<Request type="POST">
<FirstName>Aditya</FirstName>
<MiddleName>Rameshwarpratap</MiddleName>
<LastName>Singh</LastName>
</Request>
0 // And this?
So my questions are:
- What are those characters, and why are they even sent when the content type is text/xml?
- This is unrelated, but in my Jython code I've closed all the streams and the socket at the end. Is it necessary to close all of them, or would closing only a few of them do the cleanup job?
Solution
Those are chunk lengths in hexadecimal. The response body is being sent in chunks, as indicated by this response header:
Transfer-Encoding: chunked
More detail about this transfer encoding can be found on Wikipedia. The line with bc indicates the start of a chunk that is 188 bytes long (0xBC = 188). The line with 0 indicates the terminating chunk, so the client knows it can stop reading and doesn't need to wait for further chunks in case the connection is kept alive.
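To make the framing concrete, here is a minimal sketch of how a client could strip the chunk markers from a response body. The class name ChunkedBodyDecoder and its methods are hypothetical and not part of any library. The sketch assumes the status line and headers have already been consumed, and it does not read trailer headers.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.nio.charset.StandardCharsets;

// Hypothetical helper: decodes a body sent with "Transfer-Encoding: chunked".
public class ChunkedBodyDecoder {

    // Reads chunks until the zero-length terminating chunk and returns the payload.
    public static byte[] decode(InputStream in) throws IOException {
        ByteArrayOutputStream body = new ByteArrayOutputStream();
        while (true) {
            String sizeLine = readLine(in);
            if (sizeLine.isEmpty()) {
                throw new IOException("Missing chunk size line");
            }
            // A chunk size may be followed by ";extension"; ignore that part.
            int semicolon = sizeLine.indexOf(';');
            if (semicolon >= 0) {
                sizeLine = sizeLine.substring(0, semicolon);
            }
            int size = Integer.parseInt(sizeLine.trim(), 16); // e.g. "bc" -> 188
            if (size == 0) {
                break; // terminating chunk; trailers are not read in this sketch
            }
            byte[] chunk = new byte[size];
            int offset = 0;
            while (offset < size) {
                int n = in.read(chunk, offset, size - offset);
                if (n < 0) {
                    throw new IOException("Stream ended inside a chunk");
                }
                offset += n;
            }
            body.write(chunk, 0, size);
            readLine(in); // consume the CRLF that follows the chunk data
        }
        return body.toByteArray();
    }

    // Reads bytes up to the next line feed, dropping the carriage return.
    private static String readLine(InputStream in) throws IOException {
        StringBuilder line = new StringBuilder();
        int c;
        while ((c = in.read()) != -1 && c != '\n') {
            if (c != '\r') {
                line.append((char) c);
            }
        }
        return line.toString();
    }

    public static void main(String[] args) throws IOException {
        // Two chunks of 5 and 7 bytes, then the terminating chunk.
        String wire = "5\r\nHello\r\n7\r\n, world\r\n0\r\n\r\n";
        byte[] decoded = decode(new ByteArrayInputStream(wire.getBytes(StandardCharsets.US_ASCII)));
        System.out.println(new String(decoded, StandardCharsets.US_ASCII)); // prints "Hello, world"
    }
}
```

Applied to the response in the question, the bc line would be parsed as a size of 188 and the 0 line would end the loop, leaving only the XML.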
The servlet container will automatically switch to chunked encoding when the content length is unknown and the client has identified itself as an HTTP 1.1 capable client. This is even explicitly mentioned in the javadoc of doGet():
...
Where possible, set the Content-Length header (with the ServletResponse.setContentLength(int) method), to allow the servlet container to use a persistent connection to return its response to the client, improving performance. The content length is automatically set if the entire response fits inside the response buffer.
When using HTTP 1.1 chunked encoding (which means that the response has a Transfer-Encoding header), do not set the Content-Length header.
...
Your client is not written in such a way that it can consume chunked responses. It's basically a bare socket whose request line claims to be an HTTP 1.1 client.
If it's not feasible to rewrite the client so that it can deal with chunked responses (or at least to have it identify itself as an HTTP 1.0 client), or to switch to a real HTTP 1.1 aware client (in Java terms, that would be e.g. URLConnection), then rewrite your servlet in such a way that it sets the content length:
@Override
public void doGet(HttpServletRequest request, HttpServletResponse response) throws IOException {
// ...
String xmlDoc = getXML();
byte[] content = xmlDoc.getBytes("UTF-8");
response.setContentType("text/xml");
response.setCharacterEncoding("UTF-8");
response.setContentLengthLong(content.length);
response.getOutputStream().write(content);
}
If you're not on Java EE 7 / Servlet 3.1 yet, and you can guarantee that the XML content is not larger than Integer.MAX_VALUE bytes (2GB), then use
response.setContentLength((int) content.length);
or if you can't guarantee that, then use
response.setHeader("Content-Length", String.valueOf(content.length));
Note that it must represent the byte length, and thus certainly not the character (string) length. Also note that you don't need a try-with-resources statement. The container takes care of flushing and closing by itself.
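The difference between byte length and character length only shows up with non-ASCII content. The following is a small sketch of it. The class name ContentLengthDemo and the sample value are made up for illustration, and it assumes the response is encoded as UTF-8.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical demo: character count versus encoded byte count.
public class ContentLengthDemo {

    // Number of bytes the string occupies when encoded as UTF-8.
    public static int byteLength(String s) {
        return s.getBytes(StandardCharsets.UTF_8).length;
    }

    public static void main(String[] args) {
        // "\u00e9" is the letter e with an acute accent; it takes 2 bytes in UTF-8.
        String xml = "<FirstName>Jos\u00e9</FirstName>";
        System.out.println(xml.length());    // 27 characters
        System.out.println(byteLength(xml)); // 28 bytes
    }
}
```

Setting Content-Length to 27 here would make a client stop reading one byte before the end of the body.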
See also:
- Java Servlet HttpResponse contentLenght Header
- Do I need to flush the servlet outputstream?
- Should I close the servlet outputstream?
Unrelated to the concrete problem, your servlet is storing per-request data in instance variables. This is not thread-safe, because a single servlet instance is shared by all requests. Move those instance variables inside the method block. For more detail, see also How do servlets work? Instantiation, sessions, shared variables and multithreading.
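One way to do that, as a rough sketch: pass the names as method arguments so that nothing is stored on the servlet instance. The class name RequestXmlBuilder and the method buildXml are hypothetical. In doGet() the three values would be local variables filled from request.getParameter(...).

```java
// Hypothetical helper: builds the XML from arguments only, with no shared state.
public class RequestXmlBuilder {

    private static final String EOL = "\r\n";

    // As in the question, the values are inserted verbatim (no XML escaping).
    public static String buildXml(String firstName, String middleName, String lastName) {
        return "<?xml version=\"1.0\" encoding=\"utf-8\"?>" + EOL
                + "<Request>" + EOL
                + " <FirstName>" + firstName + "</FirstName>" + EOL
                + " <MiddleName>" + middleName + "</MiddleName>" + EOL
                + " <LastName>" + lastName + "</LastName>" + EOL
                + "</Request>" + EOL;
    }

    public static void main(String[] args) {
        // Stand-ins for the request parameters; each call works on its own locals.
        String firstName = "Aditya";
        String middleName = "Rameshwarpratap";
        String lastName = "Singh";
        System.out.print(buildXml(firstName, middleName, lastName));
    }
}
```

Because every request thread works on its own local variables, two simultaneous requests can no longer overwrite each other's names.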
Answered By - BalusC