
Tomcat and if-modified-since header

If you have ever worked on web pages that need to rank as high as possible, you have probably read through the Google Webmaster Guidelines. When I did, I came across the following line:

Make sure your web server supports the If-Modified-Since HTTP header. This feature allows your web server to tell Google whether your content has changed since we last crawled your site. Supporting this feature saves you bandwidth and overhead.

Because Tomcat’s default is to tell the browser or crawler that pages should not be cached, Googlebot downloads the full page every time it visits. In general this doesn’t have to be bad, but the website I am working on has a lot of pages in Google (more than 500,000), so it is wise to honor the If-Modified-Since header where possible and answer with a 304 Not Modified status instead of the full page, sparing bandwidth and server load.

So we need to check whether Googlebot sends this header and, if it does, compare its date with the current time to decide whether the page should be crawled again. For this example I treat a page as unchanged for a period of 7 days. Of course, you could also let this depend on the data in your system. On the website I am working on that is difficult, because checking for new data on every request would slow down the entire load process, which is something I would like to avoid.
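The decision described above can be tried outside a servlet container. The following is a minimal sketch, not the scriptlet itself: the class name `ConditionalGetSketch`, the method `isNotModified` and the constant `FRESH_FOR` are names I made up for illustration, and it uses `java.time` to parse the HTTP date.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZonedDateTime;
import java.time.format.DateTimeFormatter;
import java.time.format.DateTimeParseException;

public class ConditionalGetSketch {
    // Assumed freshness window: the 7 days used in this post.
    static final Duration FRESH_FOR = Duration.ofDays(7);

    /** Returns true when the request may be answered with 304 Not Modified. */
    static boolean isNotModified(String ifModifiedSince, Instant now) {
        if (ifModifiedSince == null || ifModifiedSince.isEmpty()) {
            return false; // no conditional header: send the full page
        }
        try {
            // Parses dates such as "Thu, 10 Jan 2008 09:20:50 GMT".
            Instant since = ZonedDateTime
                    .parse(ifModifiedSince, DateTimeFormatter.RFC_1123_DATE_TIME)
                    .toInstant();
            // Still "fresh" if less than 7 days have passed since that date.
            return since.plus(FRESH_FOR).isAfter(now);
        } catch (DateTimeParseException e) {
            return false; // unparseable date: fall back to a full response
        }
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2008-01-15T12:00:00Z");
        // Fetched 5 days ago: a 304 is enough.
        System.out.println(isNotModified("Thu, 10 Jan 2008 09:20:50 GMT", now)); // true
        // Fetched 14 days ago: send the page again.
        System.out.println(isNotModified("Tue, 01 Jan 2008 09:20:50 GMT", now)); // false
        // No header at all: send the page.
        System.out.println(isNotModified(null, now)); // false
    }
}
```

Passing `now` in as a parameter is only there to make the rule easy to test; in a real request you would use the current time.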

So what does the code look like to get this done?

<%
// Treat every page as unchanged for 7 days after the client last fetched it.
final long sevenDaysMillis = 7L * 24 * 60 * 60 * 1000;
final long now = System.currentTimeMillis();

long ifModifiedSince = -1;
try {
  // Parses HTTP dates such as "Thu, 10 Jan 2008 09:20:50 GMT";
  // returns -1 when the request does not carry the header.
  ifModifiedSince = request.getDateHeader("If-Modified-Since");
} catch (IllegalArgumentException e) {
  // Malformed date: ignore the header and serve the full page.
}

if (ifModifiedSince != -1 && ifModifiedSince + sevenDaysMillis > now) {
  // A 304 is not an error and carries no body, so set the status
  // (instead of sendError) and stop before any headers or content are written.
  response.setStatus(HttpServletResponse.SC_NOT_MODIFIED);
  return;
}

response.setHeader("Cache-Control", "max-age=604800");    // 7 days in seconds
response.setDateHeader("Expires", now + sevenDaysMillis); // 7 days from now
response.setDateHeader("Last-Modified", now);             // Now
%>

The code above should be enough to get the job done. But if you feel that this can be done better or easier, feel free to comment on this post.
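To see what the three caching headers look like on the wire, here is a small sketch that builds the same values as plain strings. The class `CacheHeadersSketch` and its methods are hypothetical helpers for illustration only; in the JSP, `response.setDateHeader` does the date formatting for you.

```java
import java.time.Duration;
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.format.DateTimeFormatter;
import java.util.LinkedHashMap;
import java.util.Locale;
import java.util.Map;

public class CacheHeadersSketch {
    // Assumed freshness window: the 7 days used in this post.
    static final Duration FRESH_FOR = Duration.ofDays(7);

    // HTTP date layout, e.g. "Thu, 10 Jan 2008 09:20:50 GMT".
    static final DateTimeFormatter HTTP_DATE = DateTimeFormatter
            .ofPattern("EEE, dd MMM yyyy HH:mm:ss 'GMT'", Locale.US)
            .withZone(ZoneOffset.UTC);

    /** Builds the caching headers for a page served at the given moment. */
    static Map<String, String> cacheHeaders(Instant now) {
        Map<String, String> headers = new LinkedHashMap<>();
        headers.put("Cache-Control", "max-age=" + FRESH_FOR.getSeconds());
        headers.put("Expires", HTTP_DATE.format(now.plus(FRESH_FOR)));
        headers.put("Last-Modified", HTTP_DATE.format(now));
        return headers;
    }

    public static void main(String[] args) {
        Instant now = Instant.parse("2008-01-10T09:20:50Z");
        cacheHeaders(now).forEach((name, value) -> System.out.println(name + ": " + value));
        // Cache-Control: max-age=604800
        // Expires: Thu, 17 Jan 2008 09:20:50 GMT
        // Last-Modified: Thu, 10 Jan 2008 09:20:50 GMT
    }
}
```

A crawler that stores this Last-Modified value and sends it back as If-Modified-Since within the next 7 days would then get the 304 response.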

