Dr. katz's closet - Of Man, Women, Life and Computer Science: 2008

Wednesday, December 17, 2008

Agilefant : Is it SPRING or what?

Suppose you have a web page.
Suppose the web page is a SCRUM management web application.
Suppose it's called Agilefant, made by the guys over at TKK, the Helsinki University of Technology. Here is the project page:
http://www.agilefant.org/wiki/display/AEF/Agilefant+Home

Now suppose we would like to integrate the most important information from this very useful application into our Information Radiator. How would we do it?
We would have to handle a few minor issues first, access for example. We don't want to rely on the available option to "remember" the host that connects to our Agilefant server. We want to create a new session every time we need to access it.
Agilefant is based on Spring (http://www.springframework.org). Spring is an application framework. Google it if you want more information; I am surely not crazy enough to dive into its depths, and trust me, it's HUGE.
But anyway. Back on track.
Spring, in turn, bases its authentication mechanism on Acegi (http://www.acegisecurity.org), Spring's standard authentication sub-framework. Now, Acegi supports various methods of authentication (Form, BASIC and NTLM being the most important ones, but it offers WAY more), but Agilefant provides you only with form-based access, which can be a bit restrictive in cases like ours where you need to leave the door open for automated crawling.

How do we overcome this restriction? Well, Spring handles the authentication through the use of filters, which basically redirect you depending on whether you supply correct or incorrect credentials. In case the authentication is successful, the response will contain a session ID (JSESSIONID) in the form of a cookie, which has to be used for later accesses. Which means that we first need to perform a login, fetch the cookie and then fetch our Agilefant informative pages (by providing the cookie) - for example a burndown graph. Since (as mentioned) Agilefant supports only form-based authentication, we have to handle the whole issue at HTTP level.
And here is where the php_http extension for PHP proves to be extremely handy. In fact, this library gives you full control over HTTP requests / responses, providing functionality for sending, receiving, marshalling and unmarshalling HTTP messages. Wow. You could build a browser in PHP with those functions (has anyone ever been crazy enough to think of that anyway?).

So now the steps to perform (at HTTP level through PHP) are as follows :

a) supply a valid username/password pair to Agilefant's authentication target
b) fetch the cookie from the response
c) fetch our desired Agilefant data with the previously obtained cookie

Which, translated to PHP, comes down to a very simple script that looks like this:

// The final output is the chart image, so announce it as PNG
header('Content-type: image/png');
// a) post the credentials to the login target
$params = array ("j_username" => "yourusername", "j_password" => "yourpassword");
$response = http_post_fields("http://agilefant/agilefant/j_spring_security_check", $params);
// b) fetch the session cookie from the response headers
$rHeader = http_parse_headers ($response);
$cookies = http_parse_cookie ($rHeader['Set-Cookie']);
// c) fetch the desired page, sending the session cookie along
$options = array ('cookies' => $cookies->cookies);
$response = http_get("http://agilefant/agilefant/drawChart.action?iterationId=38", $options);
$rBody = http_parse_message ($response);
echo $rBody->body;
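Step b) is the one that is easiest to get wrong, so here is a minimal sketch of it in isolation, written in Javascript. The helper names (extractCookie, buildCookieHeader) and the sample header value are made up for illustration; the only thing taken from the post is that the session travels in a JSESSIONID cookie.

```javascript
// Hypothetical helper: pull one cookie value out of a raw Set-Cookie header line.
function extractCookie(setCookieHeader, name) {
  // The first "name=value" pair before any ";" is the cookie itself;
  // everything after it (Path, HttpOnly, ...) is an attribute.
  const firstPair = setCookieHeader.split(";")[0].trim();
  const eq = firstPair.indexOf("=");
  if (eq === -1) {
    return null;
  }
  const cookieName = firstPair.slice(0, eq);
  const cookieValue = firstPair.slice(eq + 1);
  return cookieName === name ? cookieValue : null;
}

// Hypothetical helper: build the Cookie request header for the follow-up request.
function buildCookieHeader(name, value) {
  return name + "=" + value;
}

// Sample header value, invented for this example.
const sampleHeader = "JSESSIONID=ABC123; Path=/agilefant";
const sessionId = extractCookie(sampleHeader, "JSESSIONID");
console.log(buildCookieHeader("JSESSIONID", sessionId));
```

The follow-up request then only needs to carry that one Cookie header; the cookie attributes are meant for the client and are not sent back.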

Thursday, December 11, 2008

HTTP Header faking - OR: how to login to a page with PHP

How do I log in to a (secured) login page via PHP? It's possible, apparently. What we need to do is set up the correct HTTP request by setting the header and field values.


function do_post_request($url, $data, $optional_headers = null)
{
   $params = array('http' => array(
                'method' => 'POST',
                'content' => $data
             ));
   if ($optional_headers !== null) {
      $params['http']['header'] = $optional_headers;
   }
   $ctx = stream_context_create($params);
   $fp = @fopen($url, 'rb', false, $ctx);
   if (!$fp) {
      throw new Exception("Problem with $url, $php_errormsg");
   }
   $response = @stream_get_contents($fp);
   if ($response === false) {
      throw new Exception("Problem reading data from $url, $php_errormsg");
   }
   return $response;
}

This function (found at
http://netevil.org/blog/2006/nov/http-post-from-php-without-curl)
gives me a PHP entry point. But how do we actually set up the headers? What do we need to send so that the server believes a user typed something into the login page?
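A partial answer, sketched in Javascript: a browser submits a plain HTML form as a POST whose body is URL-encoded and whose Content-Type header is application/x-www-form-urlencoded. The sketch below only builds such a request description. The function name buildLoginRequest is made up, and the field names j_username / j_password are an assumption borrowed from the Agilefant post above; another login page will use its own field names.

```javascript
// Hypothetical helper: describe the POST a browser would send for a login form.
function buildLoginRequest(username, password) {
  // URLSearchParams takes care of percent-encoding special characters.
  const body = new URLSearchParams({
    j_username: username,
    j_password: password
  }).toString();
  return {
    method: "POST",
    headers: {
      "Content-Type": "application/x-www-form-urlencoded",
      // The encoded body is plain ASCII, so its length equals its byte count.
      "Content-Length": String(body.length)
    },
    body: body
  };
}

const request = buildLoginRequest("yourusername", "yourpassword");
console.log(request.headers["Content-Type"]);
console.log(request.body);
```

With the PHP function above, the same two pieces would go into $data (the encoded body) and $optional_headers (the Content-Type line).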

Thursday, November 27, 2008

JMX-HTTP + MBeans exploration = STRUTS-enabled CC

So now I found this nifty JMX-HTTP adapter, which provides an HTML view of the MBeans' interfaces! Victoly! you would say, BUT the application doesn't seem to work as it should.
It's STRUTS-based and makes use of custom tag libraries. It came as a WAR file, so I had to extract it (no problem at all) on the CC server. Now, when I navigate to the site, I get the following error :

Failed to load or instantiate TagExtraInfo class: com.cj.string.StringVariable

Where is the problem, you might ask - good question. The problem is of course in Cruisecontrol : it comes with its own JMX server. Now, the CC control panel is based on MBeans. We can then assume that CC's JMX server does NOT support STRUTS.
So what I could try is to enable STRUTS on the CC JMX server; I don't know if that is actually possible, so I will first of all try to gather more information about the CC webserver.

Monday, November 24, 2008

Information Radiator #4 - The deeps of EJB's

Who the hell uses EJBs anyway.
I mean, we have Webservices. They handle things so well. Why oh why use EJBs..Bah.

Anyway, having thrown away the idea of directly accessing web pages from the IR, I will instead fetch them through specific PHP adapter pages. This approach has two main advantages:

a) security is (probably) more manageable
b) Collecting information and then displaying it in my own page and own style is always nice

So there is this page that we want to have, which should display in a very simple way the status of the currently running Cruise Control projects. Leaving aside the fact that we have three different CruiseControl servers in here, I am currently researching how to obtain the status of the individual projects through the bean interface. Easier said than done: I haven't used beans since university, and even there they only mentioned them vaguely (yes, really!).

Thursday, November 20, 2008

Information Radiator #3 - the judgement!

So now, after testing the system I have come up with, I discovered three things :

Mozilla and IE show totally different behaviour when fetching remote pages (as in non-localhost pages)
IE has a problem with XML DOM attributes for some strange reason
Images are not displayed correctly (because of relative paths in the page sources)

The next step will then be to first fix the IE XML bug, and then fix the Mozilla remote page fetching. Something tells me the reason for the problem may be that IE's Javascript objects (contrary to Firefox's) may not be persistent..

The XML DOM problem in IE was basically a (stupid) incorrect check for an unassigned value.

As for the remote page fetching, after some investigation I came up with some hints :
a) Cross-domain fetching does not always work - Firefox has it disabled by default and fails the fetch; IE can enable it but will show a dialog asking for user confirmation when it tries to fetch the page.
b) It doesn't seem to work from Firefox even if I am in the same domain. IE behaves somewhat strangely : if I add the site to the "trusted" sites list, it denies access to the page. Otherwise it prompts for confirmation and -WOW- works.. But the refreshing of the IE page seems to have some trouble. Getting it to work on IE is not a primary goal, since it will be displayed on an Opera browser anyway, so I'll skip this for now.

There is a way to enable cross-domain fetches in Firefox (apparently Firefox doesn't recognize Windows being part of a domain if you don't log on with a domain username..). Firefox has a run-time privilege structure for scripts (based on the old Netscape Communicator one). By adding a call to the following function in Javascript code :

netscape.security.PrivilegeManager.enablePrivilege()

the script will be granted the privilege if:

a) the signature of the script is valid
b) codebase principals are enabled

Interesting page for prefs.js customization :
http://www.zachleat.com/web/2007/08/30/cross-domain-xhr-with-firefox/

The privileges that you can enable at runtime are :

`UniversalBrowserRead`	Reading of sensitive browser data. This allows the script to pass the same origin check when reading from any document.
`UniversalBrowserWrite`	Modification of sensitive browser data. This allows the script to pass the same origin check when writing to any document.
`UniversalXPConnect`	Unrestricted access to browser APIs using XPConnect.
`UniversalPreferencesRead`	Read preferences using the `navigator.preference` method.
`UniversalPreferencesWrite`	Set preferences using the `navigator.preference` method.
`CapabilityPreferencesAccess`	Read/set the preferences which define security policies, including which privileges have been granted and denied to scripts. (You also need UniversalPreferencesRead/Write.)
`UniversalFileRead`	window.open of `file://` URLs. Making the browser upload files from the user's hard

Wednesday, November 19, 2008

Information radiator part #2 - Javascript design + AJAX = CRAWLER!

So now that it's clear how to actually realize our classes, we can take a step back and dig deeper into the design (again).
What I want is to share one single XMLHttp object over multiple requests. That is, we need to configure the object with the correct (request, readystatechange handler) pair for every request.
So we will need a way to couple the XmlHttp object with a readystate handler, for example :

XMLHttpFactory.prototype.setReadyStateChange = function (onReadyStateChange) {
this.__xmlObj.onreadystatechange = onReadyStateChange;
}

Another thing to consider is the request itself : different requests require fetching different server-side or static HTML pages (which will generate the result; this is a basic part of the AJAX technology which is not mentioned as often as it should be, imo).
For sending a request to the server, the XMLHttp object provides a basic "open" method for preparing the request, as well as a "send" method for forwarding it to the server. For example :

xmlHttp.open("GET","time.asp",true);
xmlHttp.send(null);

So let's add a specific method "submitRequest" in our Javascript prototype:

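XMLHttpFactory.prototype.submitRequest = function (nature, target, onReadyStateChange) {
    // The handler is attached to the object; the third argument of open is the "async" flag
    this.__xmlObj.onreadystatechange = onReadyStateChange;
    this.__xmlObj.open (nature, target, true);
    this.__xmlObj.send (null);
}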

Our "submitRequest" method expects the kind of request (nature, "POST" or "GET"), the target server page, and the onReadyStateChange event handler. Notice that this makes the setReadyStateChange method we defined earlier unneeded. Thus I removed it.

Next is a class for managing the sequence configuration file (sequence & timings logic) and the configuration itself.

The configuration is represented in xml format to increase readability.
Each "display" tag represents a single page that will be displayed. In this case "test1.php" will be fetched and displayed for 10 seconds and then it will be "test2.php"'s turn.
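The rotation itself can be sketched independently of the XML parsing. The following is only an illustration under assumptions: the configuration is represented here as a plain array with "source" and "time" fields (the same names the presentation code below reads as attributes), the 10 seconds for test2.php are invented, and makeRotator with its injected "show" and "schedule" callbacks is a made-up helper so that the logic can be dry-run without a browser.

```javascript
// Assumed in-memory form of the configuration: one entry per "display" tag.
const displays = [
  { source: "test1.php", time: 10 },
  { source: "test2.php", time: 10 } // display time assumed for the example
];

// Returns a function that shows the next page each time it is called and
// schedules itself again once that page's display time has passed.
function makeRotator(pages, show, schedule) {
  let index = 0;
  function next() {
    const current = pages[index];
    show(current.source);
    index = (index + 1) % pages.length; // wrap around after the last page
    schedule(next, current.time * 1000);
  }
  return next;
}

// Dry run with a fake scheduler that records the calls instead of waiting.
const shown = [];
const pending = [];
const rotate = makeRotator(
  displays,
  function (source) { shown.push(source); },
  function (fn, ms) { pending.push({ fn: fn, ms: ms }); }
);
rotate();
pending.shift().fn();
pending.shift().fn();
console.log(shown); // [ 'test1.php', 'test2.php', 'test1.php' ]
```

In a real page, "schedule" would simply be setTimeout and "show" would hand the source to the transport level.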

Considering the fact that we actually want to show web pages from different sources, we will first need to fetch them and then process them. Following this train of thought, we can come up with a straightforward structure :

at the bottom, a transport level which takes care of the XMLHttp handling
on top of it, a presentation level, which displays pages based on the configuration file we feed it

Easy.
The transport level is very basic. Its interface includes :

a target URL to fetch
a target element to insert the fetched webpage into

What I've come up with is (the code is pretty much self-explanatory, so I won't get more into detail):

//Class prototype for the XMLHttp transport facility.
function XMLHttpTransport(){
    try
    {
        // Firefox, Opera 8.0+, Safari
        this.__xmlObj = new XMLHttpRequest();
    }
    catch (e)
    {
        // Internet Explorer
        try
        {
            this.__xmlObj = new ActiveXObject("Msxml2.XMLHTTP");
        }
        catch (e)
        {
            try
            {
                this.__xmlObj = new ActiveXObject("Microsoft.XMLHTTP");
            }
            catch (e)
            {
                alert("Your browser does not support AJAX!");
                return false;
            }
        }
    }
}

//Variable for XMLHTTP Object
XMLHttpTransport.prototype.__xmlObj;
//Variable for Item Id - attached to xmlObj!
XMLHttpTransport.prototype.__item;

//Request submitter
XMLHttpTransport.prototype.submitRequest = function (nature, target, item) {
this.__xmlObj.__item = item;
this.__xmlObj.__target = target;
this.__xmlObj.onreadystatechange = this.ReadyStateChange;

try {
this.__xmlObj.open (nature, target, true);
}
catch (e) {
alert ("XMLHTTP Open Didn't work as expected! Exception : " + e.description);
}

try {
this.__xmlObj.send (null);
}
catch (e) {
alert ("XMLHTTP Send Didn't work as expected! Exception : " + e.description);
}
}

//Static event handler for basic display - assigns an element its inner html!
//xmlHttpTransport (the global instance) is used as reference to the xmlhttp object:
//the handler is invoked in the context of the xmlhttp object, not of xmlHttpTransport,
//so "this" is the xmlhttp object carrying __item and __target.
XMLHttpTransport.prototype.ReadyStateChange = function (){
    var elText = "";
    switch (xmlHttpTransport.__xmlObj.readyState)
    {
        case 0:
            elText = "";
            break;
        case 1:
            elText = "Request ready";
            break;
        case 2:
            elText = "Request sent";
            break;
        case 3:
            elText = "Processing request..";
            break;
        case 4:
            if (xmlHttpTransport.__xmlObj.status == 200)
                elText = xmlHttpTransport.__xmlObj.responseText;
            else
                elText = "Page Not Found: " + this.__target;
            break;
    }
    document.getElementById(this.__item).innerHTML = elText;
}
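A side note on the "context of invocation" remark in the comment above. The handler is called with the request object as "this", which is why the code reaches the transport through a global variable. One common alternative is to capture the transport in a closure. The sketch below only illustrates that idea: FakeRequest is a made-up stand-in for an XMLHttp object so the example runs outside a browser, and Transport is not the class from this post.

```javascript
// Stand-in for an XMLHttp object: it calls its handler with itself as "this".
function FakeRequest() {
  this.onreadystatechange = null;
  this.readyState = 0;
}
FakeRequest.prototype.fire = function (state) {
  this.readyState = state;
  this.onreadystatechange(); // inside the handler, "this" is the FakeRequest
};

// Hypothetical transport that remembers which states it has seen.
function Transport(request) {
  this.request = request;
  this.log = [];
}
Transport.prototype.attach = function () {
  var self = this; // capture the transport for use inside the handler
  this.request.onreadystatechange = function () {
    // "this" is the request object, "self" is the transport
    self.log.push(this.readyState);
  };
};

var transport = new Transport(new FakeRequest());
transport.attach();
transport.request.fire(4);
console.log(transport.log); // [ 4 ]
```

With this pattern several transports could coexist, since none of them depends on a global name.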

While for the presentation level we will have :

//-----------------------------------------------------------------------------------------------------
//Timed execution
function SequenceConfig(configData){
    //Loads the document
    try
    {
        //All other browsers
        var parser = new DOMParser();
        this._xmlDoc = parser.parseFromString(configData, "text/xml");
    }
    catch (e)
    {
        try
        {
            //Internet Explorer
            this._xmlDoc = new ActiveXObject("Microsoft.XMLDOM");
            this._xmlDoc.async = false;
            this._xmlDoc.loadXML(configData);
        }
        catch (e)
        {
            alert('No support for DOM in your browser?');
        }
    }
    this._currentDisplay = 0;
}

//Configuration properties
SequenceConfig.prototype._xmlDoc;
SequenceConfig.prototype._currentDisplay;

SequenceConfig.prototype.displayNextPage = function (xmlHttpTransport){
    //Initialize if needed (-1 means "start over from the first display")
    if (this._currentDisplay == -1)
    {
        var el = this._xmlDoc.documentElement.firstChild;
        this._currentDisplay = this._get_nextNode (el);
    }
    //time next display! (the string is evaluated in global scope, hence the globals "config" and "xmlHttpTransport")
    t = setTimeout("config.displayNextPage (xmlHttpTransport)", this._currentDisplay.getAttribute ('time') * 1000);
    //Display it!
    xmlHttpTransport.submitRequest ("GET", this._currentDisplay.getAttribute ('source'), "root");
    //get next page
    this._currentDisplay = this._get_nextNode (this._currentDisplay.nextSibling);
}

SequenceConfig.prototype._get_nextNode = function (node){
    var x = node;
    //Skip everything that is not an element node (nodeType 1), e.g. whitespace text
    while (x != undefined && x.nodeType != 1)
    {
        x = x.nextSibling; //nextSibling is a property, not a method
    }
    if (x == undefined)
    {
        return -1; //Stop if there are no more nodes
    }
    return x;
}

The configuration file is loaded into a dedicated variable at startup with the help of a PHP code snippet (which loads the content of the configuration file and removes the newlines) :

//Configuration data will be included through a server-side include directly to the variable.
var configData = '';
var xmlHttpTransport = "";
var config;

So all that is missing now is a Javascript initiator (a method to start the whole process), which will be executed when the page loads (in the onLoad event of the "body" tag).

function init()
{
xmlHttpTransport = new XMLHttpTransport();
config = new SequenceConfig (configData);
config._currentDisplay = -1;
config.displayNextPage (xmlHttpTransport);
}

Easy eh? ;)

Tuesday, November 18, 2008

Information radiator - Javascript a Go-go

Well, looks like we want to improve the Scrum in our company.
We need a brand-new information radiator.

An information radiator is actually a whiteboard, or a big piece of paper, even better a monitor, sometimes even a real traffic light! Its purpose is to provide status information on the hot projects running in a development team, in a compact, essential and easily understandable form.

You could compare an information radiator to an information cache. CruiseControl's main page is a good example of an information radiator.

The sources from which the displayed information is fetched can be totally inhomogeneous. In our case, for example, we want to display our latest burndown graph, the results of the currently running tests, analysis tool results, bug graphs, current build status, and even the coffee machine status or a toilet paper-o-meter.

So what we actually want is a kind of information center about the activities of our company, which should display the status of X projects in a chosen format and a specified sequence.

Above all, it should be minimalistic. Totally minimalistic. Showing only the essentials, that is. For example, we are not interested in showing any log data in it, or error / warning messages that occurred during the build process; even execution information would be skipped. That kind of detailed information can be fetched from the primary source of information, for example the builder / compiler reports themselves (in our case the CruiseControl project status pages). The information radiator only offers a quick view of what is going on in our company.

Now that the idea is clear - how do we get this beast together?

The essence of the above is :

We want to present information fetched from inhomogeneous sources in a compact, distilled and easily understandable form on a single, easily accessible medium.

There will thus be a need for

a) homogenization of information (from the different sources)
b) presentation of the homogenized information
c) last but not least : scalability

Suppose our company has 1000 projects going on, with 100000 employees, 34000 test cases and 3000 different test tools. We need a system which could merge the relevant status information from all of the sources that have been included in the display cycle on one single information radiator.

The first thing that comes to my mind is CORBA, for some strange reason. It is scalable, customizable (thus can homogenize different information sources), and can be supplied with a good presentation logic quite easily. The reason why I thought of CORBA is actually that the behaviour of our information radiator shows many parallels with the behaviour of CORBA applications.

But of course, we don't have CORBA, nor do we plan to use it. Our needs can easily be fulfilled with simple web pages as well. We will "borrow" some pieces of CORBA's paradigm to find an easy solution, for example by separating presentation logic from the information sources through the use of adapters.

I'm a big fan of XP, so since I have enough design to start with, I'll start right away.

First I'm going with the presentation logic (the easy part, and the one that gives more satisfaction :) ). The idea is to use Ajax to load the pages from a specified location, and Javascript to refresh the page's content. So first we are looking at the timed events in javascript.

Javascript supports timed events through two methods :

setTimeout
clearTimeout
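A minimal sketch of the two calls, with made-up names (showLater, messages): setTimeout returns a handle, and passing that handle to clearTimeout cancels the pending call before it fires.

```javascript
// Hypothetical helper: schedule a message once, after delayMs milliseconds,
// and return the timer handle so the caller can still cancel it.
function showLater(message, delayMs, sink) {
  return setTimeout(function () {
    sink.push(message);
  }, delayMs);
}

const messages = [];
const kept = showLater("refresh page", 20, messages);
const cancelled = showLater("never shown", 20, messages);

// Cancelling only needs the handle that setTimeout returned.
clearTimeout(cancelled);

// Nothing has run yet at this point: the callbacks fire later, asynchronously.
console.log(messages.length);
```

Once the 20 milliseconds have passed, only "refresh page" ends up in the messages array.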

Since the presentation logic will be based on Javascript, this is a good occasion to dive into Javascript's classes - which I haven't been doing since I was at university.

Javascript is very powerful. But that we know already. Fact is that Javascript doesn't support classes. Javascript is a language for prototype-based object modeling. Everything in Javascript is an object. Everything. Every object in Javascript has a so-called "prototype", which can be roughly compared to the generic class definition of common OOP languages like Delphi, C#, C++, and (yes, sadly) VB. The objects in Javascript are dynamic objects, which means that you can add new properties, methods, or anything you like at any point of execution to any of the objects you have defined, or to an object's prototype. Confusing, huh? But nifty. This actually gives you total freedom over structure control. On the other hand, the definition of a "class" in the classic sense is a bit different in Javascript. As I told you already, classes need to be defined differently than in other programming languages: starting from an object's prototype rather than from a single (static) class definition. What this means practically is :

a) you will find no "class" keyword in Javascript or whatsoever
b) The most common way to define a class is to construct it starting from an object's prototype

Point a) is pretty much straightforward. Point b) is the obscure part.
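To make point b) a bit less obscure, here is a tiny sketch with a made-up Counter "class". It shows a constructor function playing the role of the class, and a method added to the prototype after an object has already been created.

```javascript
// The constructor function acts as the "class".
function Counter() {
  this.count = 0;
}

// Methods live on the prototype and are shared by all instances.
Counter.prototype.increment = function () {
  this.count += 1;
  return this.count;
};

const counter = new Counter();
counter.increment();

// Added later, at runtime: existing instances see the new method right away,
// because the lookup goes through the prototype.
Counter.prototype.reset = function () {
  this.count = 0;
};

counter.reset();
console.log(counter.count); // 0
```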
But just think: if everything in Javascript is an object, and thus has a prototype, we can construct a class out of everything. The most common way is to use a function prototype. In our case, for example, I want to make use of Ajax for updating the slideshow on the Information radiator. I will thus need a class to group this kind of behaviour, and of course I need some code (a function) that initializes the XmlHttp object (depending on the browser etc. etc.). So suppose our function will look like this :

function XMLHttpFactory()
{
    // The object is stored as __xmlObj, the name used by the prototype variable and the getter
    try
    {
        // Firefox, Opera 8.0+, Safari
        this.__xmlObj = new XMLHttpRequest();
    }
    catch (e)
    {
        // Internet Explorer
        try
        {
            this.__xmlObj = new ActiveXObject("Msxml2.XMLHTTP");
        }
        catch (e)
        {
            try
            {
                this.__xmlObj = new ActiveXObject("Microsoft.XMLHTTP");
            }
            catch (e)
            {
                alert("Your browser does not support AJAX!");
                return false;
            }
        }
    }
}

This function will not only instantiate the XmlHttp object but also be the backbone of our XMLHttpFactory class. The "this" keyword references the object being constructed.
In fact, we can add more behaviour to the function's prototype, for example a variable (private by convention) for storing the reference to the XmlHttp object, plus a getter:

//Variable for XMLHTTP Object
XMLHttpFactory.prototype.__xmlObj;

//Getter
XMLHttpFactory.prototype.getXmlObj = function () {
return this.__xmlObj;
}

In the same way, we want to provide more functionality for our XMLHTTP factory to react to different requests issued to the server in specific ways.

Tuesday, November 11, 2008

MS VC++6.0 Custom Build anyone?

Have you ever been fiddling around with custom build steps in Microsoft Visual Studio?
Lots of people find them totally useful. Others think they are merely another lock-in for Microsoft customers.

Custom build rules are THE solution for integrating the compilation of legacy sources and the use of legacy tools (assemblers, image generators, scripting engines, md5 computing tools and so on) in one single project (in Microsoft's dev env, that is). They are fast, but not straightforward. Well, they actually ARE in VS2005, but not in VC++ 6.0.. Don't ask.

Custom build rules are expressed as batch script. Or so it seems. So far, so good, you say, if it weren't for one problem : they use obscure macros in practically all configurations. Well, the names of the macros help a lot with understanding the meaning of each of them of course (for example $(InputDir)), but still, relying on your imagination isn't always the right thing to do.
Echoing them out with a plain batch "echo" doesn't seem to work, as it shows nothing (at least not when compiling the file in the environment itself), and I am too lazy to try it on stdout / stderr.

Got an idea: since they are project-dependent, they are probably stored in the project configuration files. If you open the .dsp file with an editor, you will find them defined there - well, most of them.
But where / how are they defined? And when? How can we change them?

A quick google on them shows up no satisfactory result. It's probably a too old topic, or I am just blind and don't see it.

This article from MS shows a complete list of the macros. Took some time to find it ;).

Here is a brief list of them :

Label	Macro	Description
Intermediate	$(IntDir)	Path to the directory specified for intermediate files, relative to the project directory.
Output	$(OutDir)	Path to the directory specified for output files, relative to the project directory.
Target	$(TargetDir)	Fully qualified path to the directory specified to output files.
Input	$(InputDir)	Relative path to the project directory.
Project	$(ProjDir)	Fully qualified path to the project directory.
Workspace	$(WkspDir)	Fully qualified path to the project directory.
Microsoft Developer	$(MSDevDir)	Fully qualified path to the installation directory for Microsoft Visual C++.
Remote Target	$(RemoteDir)	Fully qualified path to the remote output file.
Target Path	$(TargetPath)	Fully qualified name for the project output file.
Target Name	$(TargetName)	Base name for the output file.
Input Path	$(InputPath)	Fully qualified name for the input file.
Input Name	$(InputName)	Base name for the input file.
Workspace Name	$(WkspName)	Name of the project workspace.
Remote Target Path	$(RemoteTargetPath)	Fully qualified name for the remote output file.

Friday, November 7, 2008

HANGOVER

is a shit.

Thursday, November 6, 2008

xslt & Cruisecontrol #2 - how to get HTML code straight to the Cruise control log!

I was just wondering if there is a way to highlight specific texts in the Cruisecontrol tests result display panels...
Of course, I think adding a simple html font / color tag to the test assertion text would do the job, but on the other hand, as it is logged, the tags would be echoed as plain text to syslog as well, which would make the log unreadable, or hardly readable.

So what we need to know is how nosetests / nosexunit processes the test result.
A quick nosetests --help shows us that nosetests features the following logging facilities (see under the -l / --debug switch):

nose
nose.importer
nose.inspector
nose.plugins
nose.result
nose.selector

nose is the root logger, nose.result could be the result producing log (we are not sure yet!)

Also, we know that the result of the test gets echoed to Syslog by our homegrown syslog plugin (which I fixed some posts ago). This happens thanks to the implementation of the formatFailure function, which overrides standard behaviour and formats the output by fetching the cause from the last exception (remember, assertion failures raise exceptions!).

A quick test on our Cruisecontrol host shows that the tests' text results (which are merged by the Cruisecontrol agent at the end of the build cycle into the Cruisecontrol log test pages) are displayed as plain text. That is, HTML tags are not interpreted. How come?

The answer is simple : the text that is output by nosetests is treated by nosexunit as plain text, that is, "<" and ">" signs are not treated as xml tag delimiters but as simple characters. Thus, when they get merged into the Cruisecontrol presentation panel through the XSLT transformations, those signs are converted into the &lt; and &gt; entities. That is, the "<" and ">" are still treated as plain text.
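What that escaping does to a message can be shown with a few lines of Javascript. The escapeXml function is a hand-written illustration of standard XML text escaping, not code taken from nosexunit or Cruisecontrol.

```javascript
// Illustration only: escape the characters that have a special meaning in XML text.
function escapeXml(text) {
  return text
    .replace(/&/g, "&amp;") // must come first, otherwise the other entities get escaped again
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;");
}

console.log(escapeXml('<p style="color:red">button 4</p>'));
// prints: &lt;p style="color:red"&gt;button 4&lt;/p&gt;
```

A browser renders such entities as literal "<" and ">" characters, which is why the tags show up as text instead of being applied.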

Specifically, this is the result of a merged nosexunit output (the message is cut down due to size constraints) :

<![CDATA[The test button <p style="color:red">'button 4'</p> has not .....;]]>

So it is actually merged as CDATA (obviously). The whole thing can be fixed with a small change in the XSLT stylesheets : check the \cruisecontrol\webapps\cruisecontrol\xslt folder, which contains all of the XSLT stylesheets.
The one you are looking for is errors.xsl. But first open buildresults.xsl and make sure that the entry which includes errors.xsl is uncommented. You can tell it is if, when navigating to a build's log (through the project's name link), the Errors / Warnings section is displayed on the right part of the screen.

In the errors.xsl file you have to look for the match="message[@priority='warn']" pattern and add the disable-output-escaping="yes" attribute to the value-of instruction.

Wednesday, October 29, 2008

Victoly times two!

Now I just found out that it's only necessary to alter the "score" in the setup.py file of the syslog plugin. Peeeeeh. What you don't learn from nosetests eh? :P

nosetests who? Or : how winPDB p0wnz us all...

Victoly!!!
Hell yes!
It took almost three weeks, but in the end it was worth it.
You see, nosetests has a small problem with plugins. How did I get to that conclusion?
Well, I found out that nosetests is actually not only an executable but also a library that can be invoked from Python code (!), and I found an awesome Python debugger for Windows: winPDB, a simple debugger with the power of Python. You can imagine what comes out of it.

So I made a small python script that includes the nose module (nosetests.py) :

import nose
# run() lives in the nose package; by default it reads its arguments from sys.argv
nose.run()

Complicated, huh?
This small snippet executes nose with the command line that I passed to the script.
So once this was done, I could run it in winPDB with a command line like this:

winpdb nosetests.py --sysloghost=localhost --with-nosexunit --xml-report-folder=c:\test test_mytest.py

This opens up the winpdb debugger. After some tracing, I found what I was looking for:
in the run method of the TextTestRunner class, the plugins are finalized with the following call: self.config.plugins.finalize(result), on the second-to-last line of the method. The call redirects to the various plugin finalization methods. The plugin we are using is handled by the standard plugin manager, PluginManager. In short, the finalize method redirects the call to a simple method, which processes the finalization of the plugins until one of those finalizations returns a value.
Our homegrown syslog plugin passed the received result on to the nosexunit manager, and that result of course was not None. Needless to say, the nosexunit plugin was always executed after the syslog plugin. Because the plugin manager executed the finalization of the syslog plugin first, and that one returned a value, the nosexunit finalization was never executed, thus the output was not produced. Forcing the syslog plugin's finalization method to return None resolved the problem. I wonder if this is a bug?
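The effect is easy to reproduce with a toy model. The sketch below is an illustration written in Javascript, not nose's actual (Python) code; finalizeAll and makePlugin are made-up names, and the only behaviour modelled is the one described above: stop at the first finalization that returns a value.

```javascript
// Toy model: call finalize on each plugin in order and stop at the first
// plugin that returns something other than null / undefined.
function finalizeAll(plugins, result) {
  for (const plugin of plugins) {
    const value = plugin.finalize(result);
    if (value !== null && value !== undefined) {
      return value; // the remaining plugins are never called
    }
  }
  return null;
}

// Hypothetical plugin that records that it ran and returns a fixed value.
function makePlugin(name, returnValue, calls) {
  return {
    name: name,
    finalize: function (result) {
      calls.push(name);
      return returnValue;
    }
  };
}

// Syslog plugin returns the result: nosexunit is skipped.
const brokenRun = [];
finalizeAll(
  [makePlugin("syslog", "result", brokenRun), makePlugin("nosexunit", null, brokenRun)],
  "result"
);
console.log(brokenRun); // [ 'syslog' ]

// Syslog plugin returns nothing: nosexunit gets its turn.
const fixedRun = [];
finalizeAll(
  [makePlugin("syslog", null, fixedRun), makePlugin("nosexunit", null, fixedRun)],
  "result"
);
console.log(fixedRun); // [ 'syslog', 'nosexunit' ]
```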

Now, I don't know the following things:

How nosetests handles plugin loading / unloading with priority, if it handles finalization with priority at all
Whether it was a good idea to forcibly set the result to "None" in our syslog plugin's finalization method.

Considering the fact that the method in this case is executed as a final one, the second point is probably not very relevant. But how does nosetests actually handle plugin priorities? Seems like it assigns each of them a score, which is then used to decide which one gets loaded first and so on.
Look what I found crawling through google :

- Added score property to plugins to allow plugins to execute in a
defined order (higher score execute first).

This comes from the changelog of version 0.10.0a1. So it seems the problem is not totally new (damnit!). So it seems our search is not over.
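The ordering rule from that changelog entry ("higher score execute first") can be pictured like this. Again this is an illustrative Javascript sketch, not nose code, and the score values are invented for the example.

```javascript
// Invented scores, only to show the ordering rule.
const plugins = [
  { name: "syslog", score: 100 },
  { name: "nosexunit", score: 500 }
];

// Higher score first; slice() keeps the original array untouched.
function byScoreDescending(list) {
  return list.slice().sort(function (a, b) {
    return b.score - a.score;
  });
}

const order = byScoreDescending(plugins).map(function (p) { return p.name; });
console.log(order); // [ 'nosexunit', 'syslog' ]
```

Under such a rule, changing a plugin's score is enough to change which finalization runs first.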

Monday, October 27, 2008

nosetests - neverending story

nosetests, nosetests, nosetests!

This is getting hilarious..BUT I want to understand why it's not working. Damnit!
On top of that, it actually turned out to WORK every now and then (for example if ANT calls it). Why would this be? I don't know. I need rest, so I am pushing this back in the closet for now.

Friday, October 24, 2008

nosetests part 2 - the final solution

Ok, some more free time for me today! Yay!
Now I actually have time to investigate the nosetests - python vs CruiseControl log output feature some more.
Let's see, where did we leave off? Hm, while trying to get some properly xml formatted output for CruiseControl, we discovered that if the syslog plugin was used as well, the output would not be produced (for some strange reason). The problem arises from the fact that we use syslog to "log" internal test activity (for example, when a test starts and ends, to log exceptions etc).

Possible reasons for that include:

xml logging targeted messages get redirected to syslog
syslog interferes with and breaks xml logging

Let's check the first one.

Xml formatted output should be produced if the switch "--with-nosexunit" is present on the command line.
So let's perform the two test cases in here:

nosetests --with-nosexunit --xml-report-folder=c:\temp test_mytest.py

which produces a TEST-test_mytest.xml file in c:\temp and

nosetests --sysloghost=localhost --with-nosexunit --xml-report-folder=c:\temp test_mytest.py

which doesn't.

Now we have two effects here, which push the problem into some kind of "limbo" (no, not mambo). On one hand, we know for sure that the xml formatted output is not produced by nosexunit. The nosexunit library simply captures the stdout / stderr of the script it's running and outputs it according to the command line arguments it has received, so this (probably) means that no input is fed to nosexunit (which is why, in turn, the output is not produced). On the other hand, we also know that the expected xml-formatted output created by nosexunit is not present in our syslog target. This excludes the (least likely) case in which the output is produced by nosexunit and ends up (for some strange and impossible reason) on the syslog host.

This means that it gets lost somewhere in between. In between where? Well, nosexunit is a plugin as well. Could it be that the output to be formatted gets lost between the two plugins? To be able to answer this question, a few more tests need to be executed.

The syslog plugin which we use internally is home-made (obviously) and worked fine up to a few weeks ago. Well, "it worked fine" means "it never showed any problems". What does the plugin do? It adds a new handler to the logging instance, by doing the following:

import logging
import logging.handlers  # SysLogHandler lives here and needs its own import

# this runs inside the plugin, where "options" holds the parsed command line options
self.logger = logging.getLogger("nosetest")
self.logger.setLevel(logging.DEBUG)

# send every record to the syslog daemon on port 514
sysLogHandler = logging.handlers.SysLogHandler((options.sysloghost, 514))
formatter = logging.Formatter("%(name)-30s: %(message)s")
sysLogHandler.setFormatter(formatter)
self.logger.addHandler(sysLogHandler)

Obviously this piece of code interferes somehow with the nosexunit library.

Tomorrow? Maybe. Nah. Who cares! Of how Apache and throughput Speed changed our lives

Well, yes, I admit it, I definitely haven't been posting for four days now!

So today I had to deal with Apache and PHP. Nice. Installed and configured. Easy. No big problem. Just the fact that the eAccelerator had a version misalignment.

But the real problem is: how do you test for throughput performance?
Well, you take a VERY big file, send it with the application that is under test and record how long it takes. It's that easy. Ok, this is really not hard at all. BUT. What if you need to test the maximum throughput? How do you know you got it? You don't have a clue about how the application handles the streaming: buffered - unbuffered / paged - unpaged / UDP / TCP; and also, you don't know to what extent the network influences the transfer.
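
The "send a big file and time it" part can be sketched in a few lines. This is only a sketch under assumptions: measure_throughput and best_of are names I made up, and the send callable stands in for whatever the application under test does with the data (here an in-memory buffer, so no network is involved at all).

```python
import io
import time


def measure_throughput(send, payload):
    """Return the bytes per second achieved by a single send(payload) call."""
    start = time.perf_counter()
    send(payload)
    elapsed = time.perf_counter() - start
    # Guard against a zero reading on very coarse clocks.
    return len(payload) / elapsed if elapsed > 0 else float("inf")


def best_of(send, payload, runs=5):
    """Repeat the measurement and keep the highest rate observed."""
    return max(measure_throughput(send, payload) for _ in range(runs))


if __name__ == "__main__":
    sink = io.BytesIO()                 # stand-in for the real transfer target
    payload = b"x" * (8 * 1024 * 1024)  # 8 MiB dummy "very big file"
    rate = best_of(sink.write, payload)
    print(f"{rate / 1e6:.1f} MB/s")
```

Note that the best rate you observe this way is only a lower bound: it tells you the application can go at least that fast, not that it cannot go faster, which is exactly the problem described above.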

So all you can do is go by hypotheses, in the end. UNLESS you can have a look at the source code. Bleh :P.

Friday, October 17, 2008

Homework: Outlook, Indexes, PHP and Image rescaling!

I decided to add to my blog, erm, closet's title the following text :
"Of Man, Women, Life and Computer Science".

Ok, so I dropped by my parents' place to say "hi" - haven't seen them in a while now, living far away doesn't make things much easier but that's really not the point here - This is a NERD blog, let's keep it on that level.
So as I said, I dropped by my parents' place and discovered they had a problem with Outlook Express 6 in the office (it's a very small office, just the two of them working in there with one PC). Apparently, mails, once sent, refused to leave the "Outgoing mail" folder for the "Sent mail" one. After a short test, I realized that the mails were actually sent out, but not moved to the "Sent mail" folder.
I enabled POP3 / SMTP logging (hey, btw, I didn't really know there was such a nifty functionality in there!) by flagging the "Mail" option in the "Maintenance" page of the "Options" menu, but of course everything was working, because the mailing itself wasn't affected! So I started thinking about what could be wrong with Outlook. I googled for the same problem and it turned out that there is actually a bug in the .dbx file management that corrupts your email databases (the .dbx files), and that those databases can easily show problems if, for example, Outlook (or the whole of Windows) crashes unexpectedly. It stated that compressing the folders would clean up deleted emails (apparently Outlook doesn't do that straight away), as well as fix the problems that arose. Also, it stated that Microsoft supports files up to 2 GB. Well, my parents had a 2 GB and something archive file for the "Sent mail" folder. Was that the problem? Of course it was!
Double-checking it by dragging the mail that was still in the "Outgoing mail" folder and dropping it into "Sent mail" was quick and confirmed the theory.

Now, after this was fixed, I spent some time with my father and he suggested I could fix up the company's homepage (literally "it sucks"). I promised to do so, and already came up with some ideas. Let me introduce you to it a bit more:
They are running a merchandising brokerage agency, oriented to ski schools and sports in general (mostly winter-related, like skiing for example).
They would like to have a homepage where they could show examples of what they managed to produce. Of course, they would like to be able to add more as time passes by, and remove some if there get to be too many (the current hosting company are true thieves, they pay 80 € a year for 200 MB!! But that is another story).
The way to present the images would be contextualized, meaning that you would have a list of contexts to choose from, and depending on which context you had chosen you would get a set of images of products. Of course, one product could be posted in more than one group (given that different categories of customers may be interested in the same kind of products or in products with special attributes), so it should be possible for them to "link" images to a specific group.

In short, the rationale is that you could set up one group for each category of customers, meaning a set of pictures of relevant examples with the specific attributes which customers from that category are most sensitive to.

Of course, I would like to do all of this in AJAX, and do it *now*. I don't want to learn how to install / use (surely helpful) frameworks like Drupal / Joomla / blablabla, this project is too small to make it that complicated. I can make my own library. Yes I know, I am a masochist.

So the first thing I came up with was a small preliminary list of functionality, to get an idea of what needs to be done. You know, use-case oriented design and such. The basic functionality groups are the following:

category management functionalities
picture management functionalities
grouping management functionalities

The first one groups the adding / editing / deleting of categories, the second relates to uploading / downscaling / removing pictures, while the third one groups simple category / picture pairing logic functions like adding pictures to a group, or removing them.

What do you think?

The next step is trying to get the logic that lies behind the functionalities described above "factorized", that is, reduced to the minimal logic that is shared by all functionalities and that covers all the functionalities required by our use-cases. In geometry the same concept is referred to as a "basis", meaning a minimal set of vectors which can produce all other vectors (all of which are linearly dependent on it, that is), or in other words a spanning set of vectors which are linearly independent of each other. So what we are looking for now, following the "geometry" point of view, are the "basic logical units" which all the others (required by the functionalities) are made from.
Of course, this is not as easy as it sounds. And it surely can't be achieved in one single step. Let's try. A first (very) high-level grouping is the following:

a) Working with lists - meaning adding items and removing them (in all three of the cases)
b) Editing contents - this could actually fit in point "a" as well, but considering that what we edit here are sets of attributes, each of which mainly derives from a specific nature (categories, pictures, groups in our case), it may be that some of them require some additional / specialized (or original) unshared functionality. Well, to be honest, there are two ways of handling this. In the first way (the way we are going to do it), editing a list's item is not a behaviour of the list itself but of the item in the list. So for example you could have a list with a lot of items, each of which has a different set of attributes to be edited (for example a list with an item apple, an item pie, an item car, and so on). While on one hand this list is generic and very flexible, on the other hand it may turn out to be uncontrollable, and thus unmanageable (did I write that correctly?).
The other way is to make the list item-specific by incorporating the item's editing into the list itself, in which case you would have a list which can handle only one single type of item (the one whose content is editable by the list itself). Now this kind of list is not flexible at all, but at least you know what you are working with.

c) Associating list items to groups - or, creating indexes, or, even better, creating a new list! out of other lists.
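
The first way from point "b" (editing belongs to the item, not to the list) could look roughly like the sketch below. All the names here (Item, Picture, Category, ItemList and the attribute names) are invented for the example, nothing of this exists yet.

```python
class Item:
    """Base item: each item type declares which attributes can be edited."""
    editable = ()

    def __init__(self, **attrs):
        self.attrs = dict(attrs)

    def edit(self, **changes):
        # Editing is a behaviour of the item, so the item checks the changes.
        for key, value in changes.items():
            if key not in self.editable:
                raise KeyError(f"{type(self).__name__} cannot edit {key!r}")
            self.attrs[key] = value


class Picture(Item):
    editable = ("title", "filename")


class Category(Item):
    editable = ("name",)


class ItemList:
    """Generic list: it only adds and removes items, it never edits them."""

    def __init__(self):
        self.items = []

    def add(self, item):
        self.items.append(item)

    def remove(self, item):
        self.items.remove(item)
```

The list stays generic (it would happily hold an apple, a pie and a car), and the price is exactly the one mentioned above: nothing in the list itself tells you what kind of items you are dealing with.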

In my opinion lists are indexes. So we will call lists indexes from now on and forever.

Following this point of view, it turns out that the functionality we require is actually based on three conceptual indexes:

a Picture index
a Category index
a Group index

Let's assume we say a picture has to belong to a certain category - I don't see the point in having a picture in the system otherwise anyway. This makes things even easier. Why? Because in this way, indexes 2 and 3 melt together, and we get rid of one of them. Yeah! The result is now:

a Picture index
a Group / Category index

So what we have now are two indexes. How are they related to each other - how do they work together? Well, we know for sure that we need to manage pictures. And we just said pictures are required to belong to at least one group / category. And one group / category can have more than one picture. So we end up with a two-level index, where on top is the Group / Category index and on the bottom we have the picture indexes of the single Groups / Categories. So far, so good. The rest, tomorrow :).
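
As a quick sketch of that two-level index (top level: groups / categories, bottom level: the pictures of each one), something like the following would do. The class name CategoryIndex and its methods are assumptions for illustration only; the two rules it enforces come from the reasoning above: a picture needs at least one category, and a category can hold many pictures.

```python
class CategoryIndex:
    """Top-level index of categories, each holding its own picture index."""

    def __init__(self):
        self._pictures_by_category = {}

    def add_category(self, category):
        self._pictures_by_category.setdefault(category, [])

    def add_picture(self, picture, categories):
        # categories is a list of category names; it must not be empty.
        if not categories:
            raise ValueError("a picture must belong to at least one category")
        for category in categories:
            pictures = self._pictures_by_category.setdefault(category, [])
            if picture not in pictures:
                pictures.append(picture)

    def remove_picture(self, picture, category):
        self._pictures_by_category[category].remove(picture)

    def pictures_in(self, category):
        return list(self._pictures_by_category.get(category, []))

    def categories(self):
        return sorted(self._pictures_by_category)
```

A picture that is interesting for two kinds of customers is simply added with both categories, which is the "linking" mentioned earlier.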

Tuesday, October 14, 2008

Of Man, Snakes and Cosmologic Balls

Or, in the correct form, "how to get nosetests to do what you want".

Yes well, yesterday was kind of rushed. Plans changed on the fly and I had to switch to system testing because of the release that is planned for today.
BUT! Now I am back on nosetests, did you miss me? I really hope you did not!

So, where did we leave off? Oh yes, we found out that nosetests and nosexunit are actually correctly installed and recognized.

Just to recap, the problem is related to nosetests' output in xml format (which is supplied via the nosexunit plugin).
For instance, a command line which produces that kind of behaviour looks like this:

nosetests --sysloghost=localhost --with-nosexunit --xml-report-folder=report testunit

where

localhost is the ip address (or name) of the syslog daemon host
testunit is the unit which nosetests will search for test* routines
report is the output folder for the nosexunit reports

So now the questions are as follows:

how many files does nosexunit produce
in what xml structure

At the moment I can't answer any of those, because nosexunit doesn't actually give me any output. This would seem to be due to the fact that nosetests "traces" the output only for test suites, not for single test cases. But clearly this is not the case: we have other test cases which work flawlessly and produce excellent XML-formatted output thanks to nosexunit.
So what's wrong? Hard to tell. Luckily NoseXUnit comes with full source code, and I have some spare time left. So let's have a look.

Monday, October 13, 2008

Of Python, Nosetests and Saturn's cyclone, and discovery-based testing

Wow.
Space blows me away. Saturn and Neptune especially. Mysterious, gigantic, enormous planets. The more we get to know about them, the more we will love to know. And want to. I for one would love to set foot on Mars. I mean, can you imagine the feeling?
Check this and you'll understand what I'm talking about (courtesy of NASA)!

Now, what does this have to do with Python and Nosetests?

Honestly, I doubt that the Fail-safe Shuttle's OS has any parts implemented in Python, but possibly it has been tested with it. I am doing the same today, and this is the live feed on how it's going (so far).

The problem is I can't seem to get any test results output from nosetests in xml format. There may be two reasons for it:

I don't have nosexunit installed (plugin which is needed for it)
There is another reason why it's not working

Something tells me it will be the first one?
So the first step is to learn how to check for the nosexunit installation. Any clue?
How about starting at nosexunit's homepage first. A simple easy_install nosexunit executed from a shell gives us a good clue about the status of the nosexunit package on the current system - turns out the library is there, damn it.
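
Another quick way to check whether a package is there is to ask Python itself whether it can find the module. A minimal sketch, with two assumptions: the helper name is_installed is mine, and I am assuming the package is importable under the module name "nosexunit".

```python
import importlib.util


def is_installed(module_name):
    """Return True if a top-level module with this name can be found."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    # "nosexunit" as the import name is an assumption, adjust it if needed.
    for name in ("nose", "nosexunit"):
        status = "found" if is_installed(name) else "missing"
        print(f"{name}: {status}")
```

This only tells you that the module can be imported by the interpreter you run it with, not that nosetests actually picks the plugin up.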

So now we have to expand option two. Update follows later.

Abandoned CruiseControl

CruiseControl is used a lot in here (at the office where I work, that is).
The version of CruiseControl that I installed is the latest one : Binary distro 2.7.3.

So it came naturally that I had to deep-dive into it, mainly to understand strange or missing behaviour, for example on the logging side of the application and in configuring the dashboard (which was not easy at all).

The documentation that you can find at CruiseControl's main project page is not complete. Really, it is not. It mainly covers the configuration file options and grammar, and that is it.
Absolutely *no* clue about how to configure and set up a dashboard, or hints about how to customize logging, how to integrate and correctly visualize logging in legacy build chains etc etc.
Wow. It's the first time ever that I see a free software project this big with almost no documentation.

Let me tell you what happened when I tried to get the dashboard to work. It should be as simple as adding a few switches on the CC command line (in the cruisecontrol.bat file), but it's not, because quite a few basic jar modules are missing. So what did I do? I enabled CC debugging with the -debug switch and parsed up to 5 Megs of logs at a time for almost three days. Not the easiest of tasks, but my time is paid anyway, so who cares. It was worth it, I managed to get all the missing jars. But I lost two days' time looking for pieces of SW that should have been there.

Well, you would say that that is it. Wait, there is a lot more to come. Another "slightly unimportant" feature such as logging has been kept out of the scope of the main documentation.
For example, I assume you all know that CruiseControl uses Ant internally. Did you also know that CruiseControl logs all the messages posted by Ant on stderr as "WARNING", the ones posted on stdout as "INFO", and that errors that occur within the Ant build (that is, exceptions) are traced as "ERRORS"? Let's see how many of you raise your hands.
It took some time to discover that; luckily on Nabble there are plenty of extremely useful forums.
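
To illustrate the mapping described above (and only the mapping, this is not how CruiseControl is implemented), here is a small sketch that runs a child process and tags its output lines by stream. The function name run_and_tag and the level labels are assumptions for the example, and a failure to start the process is used as a loose stand-in for "an exception in the build".

```python
import subprocess
import sys

# Assumed labels, mirroring the behaviour described in the text.
LEVEL_BY_STREAM = {"stdout": "INFO", "stderr": "WARNING"}


def run_and_tag(command):
    """Run a command and return (level, line) pairs for its output."""
    try:
        done = subprocess.run(command, capture_output=True, text=True)
    except OSError as exc:
        # The process could not even be started: treat it as an error.
        return [("ERROR", str(exc))]
    tagged = [(LEVEL_BY_STREAM["stdout"], line) for line in done.stdout.splitlines()]
    tagged += [(LEVEL_BY_STREAM["stderr"], line) for line in done.stderr.splitlines()]
    return tagged


if __name__ == "__main__":
    child = "import sys; print('building'); print('careful', file=sys.stderr)"
    for level, line in run_and_tag([sys.executable, "-c", child]):
        print(f"{level}: {line}")
```

Note that the sketch collects all stdout lines before the stderr ones, so the original interleaving of the two streams is lost. It also shows why a perfectly healthy script that happens to write to stderr ends up flooding the build log with warnings.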

Thanks to those forums I managed to get the dashboard up and running, as well as to finally understand the way CruiseControl handles log files.

For example: suppose you have a legacy tool chain to build your deployables which bases its functionality on python scripts. Python scripts can be executed by Ant through an exec command, which will spawn (create in nerdish, that is) a new process for the python interpreter.
Now, considering that CC probably uses Ant to parse its own configuration as well, there are two ways to spawn a process:

Through an Ant Script
Directly from CruiseControl's configuration

The difference between the two calls is mainly in the way the call is handled: the spawned process ("task" in CruiseControl jargon) will be of a different nature to CruiseControl depending on whom the call has been delegated to, thus the logging for it will be handled differently. It may even happen that it gets totally ignored by the reporting application just because the task that had to be logged is unknown to CruiseControl (which is *always* the case by the way).

Friday, October 10, 2008

Closets & Dropped Surface Code

I just realized... This is actually what it's named after: a closet. In a closet you throw things that you don't need anymore. Or that you don't need that much right now, but want to keep somewhere, because you feel one day you may see a use for them.
Just imagine:
"Where did you put my hammer honey?"
"In the closet, baby".

Exactly, that's the way to go.

So, like with all closets, I'm gonna throw in all those small things that I get my hands on and that just need a place to lie down. But then, this is not a closet but a BLOG. You know what blog stands for, don't you? Binary LOG? I think that is the correct answer. Google it if you don't think so.

There are two things that are on my mind right now:

How do I keep my closet tidied up - as in, how do I keep it clean and organized (for those things that may evolve in the future from the crap I throw in here)
How do I organize my posting / closet trashing?

The first one is pertinent if I throw in concepts that show evolution and might in fact turn out to be interesting.
The second sounds a lot like the first one but actually refers to my own action of "doing" (as in "logging") the information. With what frequency? Any standard format to make it more surfable? Any idea on fast-composing it so I don't waste too much time on that thing on that particular day?

Like, for example, the concept of "bidimensional code" that burst through my mind and robbed me of my sleep some evenings ago. A topic which I suppose will label my BLOG as a n3rd blob, but honestly, who gives a shit about it? If you are reading here, that means you are a nerd already. Accept the reality and try to make something good out of it.

The meaning of "bidimensional code" is pertinent if you think of some code as being defined by a sequence of operations (which it always has been), each of which can be associated to a specific graphical form (or image - who's the l00ser that said clipart?).

Well, what's so special about that?
The answer is : I don't know. But I will try to find out. Probably. If it ever happens, that is.
The only usefulness I could really come up with for bidimensional code (as in the code's operations associated biunivocally to images) could be the possibility for a PC to recognize "written" code, along with the logical connection that is inherently buried within the structure that results from the composition of the single operation-images.

This is sick, I know. But what do you expect? It's a closet.

Hello World!

Well, first post ever in my closet... Pretty empty to be honest, but anyway! I hope to fill it soon with a lot of useful (or crappy perhaps) information. First of all, I thought about writing a quick introduction on what will be posted here. But then I realized this would take much more time than I am willing to give to this place. So I came up with what could be seen as a compromise:

giving the maximum of information with the minimum time effort.

There. Cool, huh?
Now, without starting any long explanations on what this exactly means (and thus avoiding wasting prrresssious time), let me clarify: in here I'm going to post whenever I feel that:

something good deserves to be posted here
something interesting deserves to be posted
some big news related to my (small) world requires to be spammed
I am the victim of one of those occasional "pre-sleep" brainstorms, which boost your (my in this case) imagination over the limits, provided I remember what I brainstormed about when I wake up the day after.

So. A nice short list of crap to start with eh?

But seriously, what I have been thinking of posting in here in the next days is a (somewhat) detailed howto for handling CruiseControl, as the ones I found are extremely incomplete (and so is the official website). Where I work we base everything on CruiseControl. But that's another story now.

And already, time flies! >_<

Later!

God I hope I remember my blog's address tomorrow... ^^;