Logging, Logfiles, and Statistics

Logging

Skin logfile, e.g. generic.sesam.log

Each skin has its own logfile, which records output from the skin's ResourceServlet.
When the ResourceServlet initialises, it writes a series of useful WARN statements listing exactly:

  • modification timestamp,
  • IP addresses allowed to access restricted resources,
  • which resources are restricted, by extension type,
  • the context path for each resource type (by extension type), and
  • all WEB-INF/lib jar libraries.

The logfile is rotated daily by default.

Sesat logfiles

Logging is an important part of SESAT. All SESAT Kernel actions, choices, search results, etc. are logged extensively for later parsing. Integrating these statistics into the administrative portal is an important part of SESAT's future development.

Sesat's logfiles are written by Log4j and configured in sesat-kernel/war/src/main/conf/log4j.xml

By default the logfiles are rotated daily (sesam.access is the exception).

Two further logfiles, sesam.marketing and sesam.sales, are not listed here because they are not yet in use.

SESAT produces several logfiles, each targeted at different users:

  • sesam.access
  • sesam.product
  • sesam.analysis
  • sesam.statistics
  • sesam.initialisations
  • sesam.dump

By default, these logfiles are written to your container's log directory.
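
Because every search request carries a unique request id that appears in all the logfiles (see request_id under sesam.access), a request can be followed from one logfile to the next. The following is a minimal sketch of that idea, not part of SESAT itself. It assumes one log entry per line, that the id appears literally in each entry, and that the on-disk file names match the names listed above; the function name trace_request and the sample entries are hypothetical.

```python
import tempfile
from pathlib import Path

# Logfile names as listed above; the exact on-disk names are an assumption.
LOGFILES = ("sesam.access", "sesam.product", "sesam.analysis", "sesam.statistics")


def trace_request(log_dir, request_id, names=LOGFILES):
    """Return {logfile name: [matching lines]} for one request id."""
    found = {}
    for name in names:
        path = Path(log_dir) / name
        if not path.is_file():
            continue  # logfile not present in this directory
        with path.open(encoding="utf-8", errors="replace") as handle:
            # Plain substring match: an id "r1" would also match "r10",
            # so real ids should be matched together with their delimiters.
            hits = [line.rstrip("\n") for line in handle if request_id in line]
        if hits:
            found[name] = hits
    return found


# Demonstration with hypothetical entries written to a temporary directory.
with tempfile.TemporaryDirectory() as tmp:
    Path(tmp, "sesam.access").write_text(
        "id=r1 first entry\nid=r2 other\n", encoding="utf-8")
    Path(tmp, "sesam.product").write_text("id=r1 hits\n", encoding="utf-8")
    trace = trace_request(tmp, "r1")

print(trace)
```

Rotated logfiles usually carry a date suffix, so a real lookup would also need to pick the file for the day in question.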

Types of logfiles

sesam.access

This logfile contains the first log entry for every search request, which makes it a good starting point for tracing a search request through the Search application. It partly mirrors the Apache access_log but splits the request into two or three stages:
1. Initial log entry. This entry contains the actual HTTP request parameters in the <request>-element, such as <url>, <http-referer>, <user> and <browser>.
2. Real-url. If a pretty URL was submitted to the Search application, a second entry is logged, this time with a <real-url>-element.
3. Final log entry. This entry is logged after the Search application has handled the search request and returned a response to the caller. It essentially just lists the HTTP <response code="">.
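
The three stages can be told apart by which element an entry contains. The sketch below illustrates this under stated assumptions: each entry is taken to be a small well-formed XML fragment, and the wrapper element `access` and its `id` attribute are invented for illustration. Only `<request>`, `<url>`, `<real-url>` and `<response code="">` come from the description above; the real entry layout may differ.

```python
import xml.etree.ElementTree as ET


def classify_entry(entry_xml: str) -> str:
    """Return which stage of a search request an access entry describes."""
    root = ET.fromstring(entry_xml)
    # iter() yields the root as well as its descendants, so the check works
    # whether the stage element is the root or nested below it.
    tags = {element.tag for element in root.iter()}
    if "response" in tags:
        return "final"
    if "real-url" in tags:
        return "real-url"
    if "request" in tags:
        return "initial"
    return "unknown"


# Hypothetical entries for one request; not copied from a real logfile.
SAMPLE = [
    '<access id="abc"><request><url>/search/?q=oslo</url></request></access>',
    '<access id="abc"><real-url>/search/?q=oslo&amp;c=n</real-url></access>',
    '<access id="abc"><response code="200"/></access>',
]

stages = [classify_entry(entry) for entry in SAMPLE]
print(stages)
```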

Information to look for:

  • request_id: The unique request Id that is used in all the logfiles to identify the search request.
  • time: Timestamps for when the request was received and when it was completed.
  • skin: Which virtual host received this request.
  • IP address: Only used for geographical lookup, to tag searches with Geo location.
  • HTTP session Id: Only used for counting unique users, can be used for tracking user movements.
  • http referer: The HTTP-referer string.
  • user-agent: The user-agent string (identifying browser and platform).
  • response_code: The http response code.
  • query parameters: q (phrase), c (index; news, catalog, person etc), offset (>0 for browsing resultsets), output (rss, xml etc), newscase etc.
  • boomerang: Boomerang-ed events are logged here (newspaper view, company view, Ajax messages).
  • site-search: The site-search parameters (ss_ss, ss_lt etc) must be parsed from <real-url>.
  • mobile search: The mobile searches (originator, ua) must be parsed from <real-url>.
  • retriever: Ajax solution for logging paper archive clicks/reads (/search/writeLog.do?paper=...).
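
Several of the items above (query parameters, site-search, mobile search) have to be read from a URL's query string. As a rough sketch, Python's standard urllib.parse can split such a URL into its parameters. The URL below is a made-up example built from the parameter names mentioned above, and the helper name search_params is hypothetical.

```python
from urllib.parse import parse_qs, urlsplit


def search_params(real_url: str) -> dict:
    """Extract query-string parameters from a URL; the first value wins."""
    query = urlsplit(real_url).query
    return {key: values[0] for key, values in parse_qs(query).items()}


# Hypothetical URL; the parameter values are invented for illustration.
params = search_params(
    "/search/?q=oslo+hotell&c=n&offset=10&output=rss&ss_lt=example")

phrase = params.get("q")                                   # search phrase
is_browsing = int(params.get("offset", "0")) > 0           # paging through results
is_site_search = any(key.startswith("ss_") for key in params)

print(phrase, is_browsing, is_site_search)
```

If the URL is taken from raw log text rather than from a parsed XML element, entity escapes such as `&amp;` need to be unescaped before the query string is split.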

sesam.product

This logfile is written to once the Search application has finished communicating with the query server and knows how many hits the search returned and which enrichment types should be returned. The following information can be read from it:

  • request_id
  • timestamp
  • mode: The tab (aka vertical aka service aka tjeneste) for which the search request was submitted, e.g. mode="p" for picture search.
  • query: The search phrase.
  • enrichment size: The number of enrichments returned for this search request.
  • no-hits: If this appears in the log entry, it signals that no results were found for the given query and mode.
  • enrichment type: The actual enrichment types can also be read out from the log entry, not just the number.
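
One use of these fields is to count searches without results per mode. The sketch below relies only on two details stated above: the mode appears as mode="..." and no-hits appears in the entry when nothing was found. It assumes one entry per line; the surrounding element names in the sample lines are invented for illustration.

```python
import re
from collections import Counter

MODE_RE = re.compile(r'mode="([^"]*)"')


def no_hit_counts(lines):
    """Count, per mode, the entries that carry the no-hits marker."""
    counts = Counter()
    for line in lines:
        match = MODE_RE.search(line)
        # Plain substring test: a query phrase that itself contains the text
        # "no-hits" would be miscounted, so treat the result as an estimate.
        if match and "no-hits" in line:
            counts[match.group(1)] += 1
    return counts


# Hypothetical entries; the real sesam.product layout may differ.
SAMPLE = [
    '<search mode="p"><query>fjord</query><enrichments size="2"/></search>',
    '<search mode="p"><query>qwxz</query><no-hits/></search>',
    '<search mode="n"><query>qwxz</query><no-hits/></search>',
]

counts = no_hit_counts(SAMPLE)
print(dict(counts))
```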

sesam.analysis

This logfile is related to sesam.product but logs the underlying analysis scores for each query. It breaks the scores down into individual positive and negative hits for each predicate that appears explicitly in each analysis score.

sesam.statistics

This logfile type contains more technical information, such as a servlet's execution time and the search commands it invoked. The following data can be found:

  • request_id
  • timestamp
  • skin
  • query
  • servlet execution time: How long the servlet took to process the search request.
  • servlet output: If a special output was requested (e.g. output=rss).
  • search-command: One or more fields, each containing the type of search command executed by the servlet, the number of hits it returned, and the time spent executing it.
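
The search-command fields lend themselves to simple aggregation, for example to see which command type takes the most time. The sketch below deliberately skips parsing, since the entry layout is not described here: it assumes the fields have already been extracted into (command, hits, elapsed) tuples, with elapsed in whatever unit the logfile uses. The command names "web" and "news" are placeholders, not real SESAT command types.

```python
from collections import defaultdict


def summarise_commands(records):
    """Aggregate (command, hits, elapsed) tuples into per-command totals."""
    totals = defaultdict(lambda: {"runs": 0, "hits": 0, "elapsed": 0})
    for command, hits, elapsed in records:
        entry = totals[command]
        entry["runs"] += 1
        entry["hits"] += hits
        entry["elapsed"] += elapsed
    # Add the mean time per run to each command's totals.
    return {
        command: {**entry, "mean_elapsed": entry["elapsed"] / entry["runs"]}
        for command, entry in totals.items()
    }


# Hypothetical records, as if already extracted from sesam.statistics.
records = [("web", 120, 35), ("news", 4, 20), ("web", 80, 45)]
summary = summarise_commands(records)
print(summary)
```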

sesam.initialisations

Loggers involved in core initialisation of the engine, typically SiteKeyedFactories, also log to this file.
Read this log file for startup and configuration deserialisation errors.
This logfile has no supported format.

sesam.dump

This logfile records a summary of every outbound HTTP request made from Sesat. This includes search commands, publishing fragments, and query evaluations.

 © 2007-2009 Schibsted ASA