Monday, 18 December 2006

Know Thy Language: C++ References

As a C++ programmer, I find that a lot of people moving over from plain C to C++, as well as people just beginning to learn C++, pick up the vast majority of the differences but for some reason miss references.

I think that many people read the section of their C++ book on references that says they are a bit like pointers, then decide that they have got on fine with pointers in the past and, since the book says there isn't much difference, they won't bother going any further with them. It's such a shame, really, because references are so much more than pointers. In some ways they can even be used to promote safer code and design rules for software.

In this article I will go through the list below, explaining references and their relationship with pointers, potential pitfalls, and their use as a design tool.

  • Relationship with pointers
  • Things to remember when using references
  • References as a design tool
  • Advantages of references


Relationship with pointers

A pointer can be used in two ways: it can be moved around in memory, changing the location it points to using pointer arithmetic, or it can be used to access the actual data at the memory address (de-referencing). A reference, however, can only be used like a de-referenced pointer. It is bound to an object when it is created, and that is the only object it can ever refer to. Once it has been initialised, the reference behaves as if it were the object it refers to, so assigning to the reference changes that object rather than re-pointing the reference.
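
To make the difference concrete, here is a minimal sketch. The function name reference_demo and the variables in it are made up purely for illustration.

```cpp
#include <cassert>

// Returns the final value of `a` after it has been modified through a reference.
int reference_demo() {
    int a = 1;
    int b = 2;

    int* p = &a;   // a pointer can be re-pointed...
    p = &b;        // ...so now it points at b
    *p = 20;       // writes to b through the pointer

    int& r = a;    // a reference must be bound when it is created
    r = b;         // this does NOT rebind r; it copies b's value (20) into a
    r += 1;        // r behaves as if it were a itself, so a becomes 21

    assert(&r == &a);  // r is still an alias for a
    return a;
}
```

Calling reference_demo() returns 21: the pointer was free to move from a to b, while the reference stayed bound to a throughout.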


A reference can never be null itself in the way a pointer can; however, the area of memory it refers to could be invalid. How exactly this happens and how to avoid it are explained in the next section.


Things to remember when using references

References should be used when you can guarantee the lifetime of the object the reference is pointing to. If you can do this then the reference should never become invalid. The major reason for a reference becoming invalid is that it has been bound to memory allocated with new whose lifetime cannot be guaranteed. Whenever you bind a reference to something you are making a contract with yourself, and with any code that uses your code, that whatever the reference points to will exist for the duration of its use.
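
As a rough sketch of that contract, consider a class that keeps a reference to something it does not own. Greeter and greet_safely are hypothetical names invented for this example.

```cpp
#include <string>

// Hypothetical class that holds a reference to a string it does not own.
class Greeter {
public:
    explicit Greeter(const std::string& name) : name_(name) {}
    std::string greet() const { return "Hello, " + name_; }

private:
    const std::string& name_;  // only valid while the referenced string exists
};

std::string greet_safely() {
    std::string name = "Ada";  // stack object: lives until the function returns
    Greeter greeter(name);     // greeter is destroyed before name, so the contract holds
    return greeter.greet();
}
```

The contract would be broken if a Greeter outlived the string it was given, for example if it were constructed from a temporary or from an object that is later deleted.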

Quite often, if a crash occurs inside a function that takes references as its parameters, it is the fault of the calling function for not checking that its data is valid. If the variables being passed in are allocated on the stack then this should never be a problem as they have a guaranteed lifetime; however, if the function is passed data from the heap, the pointer to that data could be null. As a general rule of thumb a null pointer should never be de-referenced and assigned to anything; this is very dangerous and can cause all kinds of crashes. It is also one of the few ways you would cause a reference to point to invalid memory. Threading can also cause an issue if the object the reference is bound to gets deleted by one thread while it's in use in another. However, if situations like this occur it's more than likely that there is a more deep-rooted problem with the code, as they shouldn't occur with proper locking in place.
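
Here is a small sketch of the caller taking responsibility for that check. The function names and the fallback value of 0 are assumptions made for the example, not a recommendation for every case.

```cpp
#include <cstddef>
#include <string>

// Hypothetical function: it takes a reference, so it assumes the caller has
// already made sure the object exists.
std::size_t name_length(const std::string& name) {
    return name.size();
}

// The caller owns the check: a possibly-null pointer is tested before it is
// de-referenced and bound to the reference parameter.
std::size_t safe_name_length(const std::string* maybe_name) {
    if (maybe_name == nullptr) {
        return 0;  // assumed fallback for this sketch
    }
    return name_length(*maybe_name);
}
```

The null test happens exactly once, at the boundary between pointer and reference, and everything behind name_length can rely on having a real object.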

As a general rule, if the object you are passing around has a guaranteed lifetime then use a reference; otherwise use a pointer.


References as a design tool

References are a very powerful design tool. If a function you have written returns a reference then other programmers will know that they shouldn't have to worry about the contents of that reference suddenly disappearing. They will also know that the function will always return them some form of valid data, as a reference cannot be null. If you use references as parameters to a function, it lets the programmer using it know that they have to take responsibility for making sure that the data passed in is valid, and will stay valid for the duration of the function call.
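
One way this can look in practice is sketched below. The Settings class is hypothetical; the point is only the contrast between the two return types.

```cpp
#include <map>
#include <string>

// Hypothetical settings store used only for illustration.
class Settings {
public:
    Settings() : title_("untitled") {}

    // Returns a reference: the title always exists and lives as long as
    // this Settings object, so callers never need a null check.
    const std::string& title() const { return title_; }

    void set(const std::string& key, const std::string& value) {
        values_[key] = value;
    }

    // Returns a pointer: the key may be missing, so "no value" has to be
    // representable and callers are expected to check for null.
    const std::string* find(const std::string& key) const {
        auto it = values_.find(key);
        return it == values_.end() ? nullptr : &it->second;
    }

private:
    std::string title_;
    std::map<std::string, std::string> values_;
};
```

Just from the signatures, a caller can tell that title() always gives back something usable while the result of find() has to be checked.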

References can be used to add more safety to code as well. For example, you can pass an array of objects to a function via a pointer and you don't have to specify how big the array is. This could cause buffer overflows if you try to go too far through the array. References won't let you do this. If you want a function to take an array by reference you have to specify the size of the array in the function declaration. This has the disadvantage of having to know exactly how big the array will be when writing the code, but it has the advantage of allowing the compiler to tell you if the array being passed to the function is the wrong size, instead of the program crashing.
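
A short sketch of what that declaration looks like follows. The function names are made up, and the template version goes slightly beyond what is described above: it is one possible way to avoid hard-coding the size while still letting the compiler know it.

```cpp
#include <cstddef>

// The size is part of the parameter type: only an array of exactly 4 ints
// can be passed to this function.
int sum4(const int (&values)[4]) {
    int total = 0;
    for (std::size_t i = 0; i < 4; ++i) {
        total += values[i];
    }
    return total;
}

// Assumption beyond the text above: a template lets the compiler deduce N
// from the argument, so the loop bound always matches the real array size.
template <std::size_t N>
int sum_all(const int (&values)[N]) {
    int total = 0;
    for (std::size_t i = 0; i < N; ++i) {
        total += values[i];
    }
    return total;
}
```

Given int a[4] = {1, 2, 3, 4}; and int b[3] = {5, 6, 7};, the call sum4(a) compiles, sum4(b) is rejected by the compiler because the sizes differ, and sum_all accepts either array.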


Advantages of references

References share most of the advantages of pointers. They are much more efficient to pass around as parameters to functions, especially if the object they refer to is large, because references, like pointers, do not incur the copy overhead of passing by value. They have an extra advantage over pointers: a reference parameter can be used within the function as if it had been passed by value. This usually means that code can be kept cleaner and easier to read, as lots of pointer notation and de-referencing can look messy.
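
The two versions below sketch the same calculation with a pointer parameter and with a reference parameter. Mesh and both function names are hypothetical, chosen only to stand in for a large object.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical "large" object for illustration.
struct Mesh {
    std::vector<double> heights;
};

// Pointer version: no copy, but every access needs pointer notation and
// the function has to decide what a null pointer means.
double average_via_pointer(const Mesh* mesh) {
    if (mesh == nullptr || mesh->heights.empty()) {
        return 0.0;
    }
    double total = 0.0;
    for (std::size_t i = 0; i < mesh->heights.size(); ++i) {
        total += mesh->heights[i];
    }
    return total / mesh->heights.size();
}

// Reference version: still no copy, and the body reads exactly as it would
// if the Mesh had been passed by value.
double average_via_reference(const Mesh& mesh) {
    if (mesh.heights.empty()) {
        return 0.0;
    }
    double total = 0.0;
    for (std::size_t i = 0; i < mesh.heights.size(); ++i) {
        total += mesh.heights[i];
    }
    return total / mesh.heights.size();
}
```

Neither function copies the vector. The reference version simply has one less case to think about and no arrows to read past.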



While this has been a quick overview of the features of references, as well as some of the things to look out for when using them, I hope it encourages people who have been putting off using them for one reason or another to start using them now. I will probably post some more short articles on using references in certain situations in the future. All comments are welcome.

Thursday, 14 December 2006

Why innerHTML can ruin your AJAX applications

The reason I am writing this article is because of some work I have been doing recently. My client wants a persistent AJAX application to work in IE6+. This means the web page will be left running for days/weeks on end without being refreshed or the browser being closed. This means it cannot have ANY memory leaks.

I currently process incoming XML using an XSL document, which works very well. However, when done under IE6 the XSL transformation is output as plain text [please let me know if there is a DOM alternative], which means the only way to insert my newly transformed code is to use innerHTML. This seems to leak memory like crazy! Note that the leak doesn't occur in Firefox.

So why does innerHTML leak? Well, after some digging I found out a few interesting facts. Let's look at the who, where, why and how...

Where did innerHTML come from?
innerHTML is not a standard; it is a proprietary feature introduced by Microsoft. Given Microsoft's huge installed base and popularity, other browsers included innerHTML for the sake of compatibility. It was never in the W3C specification. Consequently it isn't future proof and isn't designed for XHTML [XML MIME type].

Why is innerHTML used?
innerHTML is used as a cheap and easy way to write objects to a container element like a span or DIV. It tends to be faster than traditional DOM methods and takes substantially less time to write. For example you can quickly write a table structure into a DIV container using innerHTML like so:

var myDiv = document.getElementById('myDivId');
// A JavaScript string literal cannot span lines, so build it by concatenation
myDiv.innerHTML = "<table>" +
  "<tr><th>Title</th></tr>" +
  "<tr><td>Mr</td></tr>" +
  "</table>";

The equivalent code using DOM methods would take considerably more time and lines of code to write. So for lazy coders or new guys innerHTML provides a great way into the world of JavaScript. To be fair to innerHTML, it works very well for the most part. Yes, it's ugly and it's not "right", but if you're in a rush and don't have the requirements my client has then it can work very well. You should try, where possible, to avoid it though, as it can have unpredictable results and it's a bad habit to rely on innerHTML to change or update your web page. DOM methods, in the long run, will serve you better.

So why / how does innerHTML leak memory?
The specific answer to this I do not know, but we can theorize about a cause. Firstly, as previously discussed, innerHTML is not designed to work with XHTML documents, meaning that it can incorrectly interpret the string you pass to it. This means it doesn't actually render exactly what you ask it to; basically it adds things you never asked for. If you parse the HTML later on using DOM methods you may run into errors or bugs where the DOM hierarchy isn't as you expected.

Is there a solution?
Yes! Simply don't use innerHTML. I don't think you'll ever find an article, and there are quite a few, that puts it in a good light [aside from its obvious benefits]. However, if you are in my position and are forced to use it then, apparently, there are a few things you can try.

Firstly, as the memory leak may be coming from its incompatibility with XHTML, it is well worth re-writing your code to accommodate this, or re-structuring your XHTML in a different way. Unfortunately it's quite hard to know exactly what innerHTML is doing, so it's a bit of guesswork.

If you wish to use DOM methods to replace your existing innerHTML code then please look at this very good article on innerHTML alternatives.

As I said at the start, if anyone reading this is thinking "what a muppet, I know the answer" then please leave a comment, as I'm sure it would help me and anyone else reading this!

Tuesday, 12 December 2006

How to resize pictures in PHP

The web nowadays requires a level of interactivity never seen before. People love commenting and building communities, and a huge part of that is putting a face to your online presence. To this end there is a big demand for image processing techniques using web based languages. My web language of choice is PHP, so here is a short walkthrough, including code, on how to create a picture resizing function.

This function can either keep the aspect ratio of the picture or force the picture to fit the new dimensions. It is designed for jpeg pictures but can easily be extended to cater for gif and png.

The first thing you need to do is create the form in your web page. This requires you to set the enctype attribute in the form tag like so:

<form name="myForm" action="uploadpic.php" method="post" enctype="multipart/form-data">

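The enctype attribute tells the browser to encode the form as a multipart message, with each form field, including the contents of the file, sent as a separate part. This is transparent to the user and to the PHP script receiving the data, but it is useful to know because file uploads will not work without it.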

You will then need a form element with type "file", which will give you the typical input box with a browse button to allow your users to search for the file they want to upload.

<input name="userfile" type="file" size="40" />


Now we need a PHP file to receive the upload. We are assuming you have a file called uploadpic.php; within your form handling code you should have something like this to call the function we are about to write:

if (@is_uploaded_file($_FILES["userfile"]["tmp_name"]))
{
$img_name = resizePic("player","userfile","../picupload/",150,150,0,"destination_name.jpg");
}

$_FILES is a PHP superglobal containing an array of the uploaded files. Note that "userfile" is the name of the form element we specified earlier, and tmp_name is the temporary name given to the file when it's uploaded. The if statement confirms the file was uploaded OK. The line inside it calls our function resizePic and stores its return value in a variable called $img_name. The return value of the function is discussed later.

OK, now onto our function definition. We saw the arguments passed to it in the previous section of code; now we can see what each parameter does.

function resizePic($prefix,$formSrc,$dDir,$maxWidth,$maxHeight,$resizeFlag,$tempName)

  • The $prefix parameter defines the prefix for all images uploaded, if left blank no prefix is given. This is so we could have all our images renamed to news_00001.jpg etc.
  • The $formSrc parameter simply passes the name of our form element to the function, so it can find the file we uploaded.
  • $dDir is the destination directory for the image
  • $maxWidth is obviously the maximum width the picture can have
  • $maxHeight does the same for the height
  • $resizeFlag defines whether the image should keep its original aspect ratio
  • And finally $tempName defines the name of the image

OK, now to flesh out our function. Our first job is to copy the temporary image so that we can work on it without fear of it getting deleted, as it is currently held in a temporary folder on the server.

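if (!copy($_FILES[$formSrc]["tmp_name"],$dDir.$_FILES[$formSrc]["name"])) {
//report which upload could not be copied, escaping the user supplied name
print ("failed to copy ".htmlspecialchars($_FILES[$formSrc]["name"])."...\n");
}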

This code will copy the file, with its current file name, to our destination folder.

If no $tempName was given we want to create a random name for the image. We do this simply by concatenating the prefix and a random number like so:

if ($tempName == "") { $tempName = $prefix."_".mt_rand().".jpg"; }


Next we need to create a few variables to store the location and names of our source image, and the eventual location of our finished image.

//hard copy of picture
$src_img = $dDir.$_FILES[$formSrc]["name"];

//destination
$dest = $dDir.$tempName;


We now need to get a little information about our file, confirm it's an image, and find out its current dimensions. This is done easily with a call to PHP's getImageSize function.

//get source dimensions
$src_dims = getImageSize($src_img);


Now that we have our image information, we need to create a temporary image in memory, dealing with each image type separately. For the sake of this tutorial I am going to concentrate on jpeg only; the case statements for the other types are left in, though, and it will only take a small amount of work to alter the function to work with all image types.

//create appropriate temp image
switch ($src_dims[2]) {
case 1: //GIF
break;

case 2: //JPEG
$srcImage = imageCreateFromJpeg($src_img);
break;

case 3: //PNG
break;

default:
return false;
break;
}


Next we need to find out if we want to keep the aspect ratio, if so resize using a divisor, else just force the image to the correct dimensions. This is done by finding out how the image dimensions compare to our desired dimensions then dividing or multiplying accordingly.

$srcRatio = $src_dims[0]/$src_dims[1]; // width/height ratio
$destRatio = $maxWidth/$maxHeight;

if ($destRatio > $srcRatio) {
$destSize[1] = $maxHeight;
$destSize[0] = $maxHeight*$srcRatio;
}
else {
$destSize[0] = $maxWidth;
$destSize[1] = $maxWidth/$srcRatio;
}

//if set image dimensions are required:
if ($resizeFlag == 1) {
$destSize[0] = $maxWidth;
$destSize[1] = $maxHeight;

}

As you can see we now have an array called $destSize which contains the final height and width of our image. All we have to do now is create a new image in memory to the correct dimensions, then copy our source image into that new placeholder. We will use a function called imageCopyResampled to achieve this which usually gives the best results when resizing.

$thumb_w = $destSize[0];
$thumb_h = $destSize[1];
$dst_img = imageCreateTrueColor($thumb_w,$thumb_h);
imageCopyResampled($dst_img,$srcImage,0,0,0,0,$thumb_w,$thumb_h,$src_dims[0],$src_dims[1]);

Our image is almost done. All we need to do now is save it to the destination folder and delete any temporary images we created along the way. We then return our final image name so that the calling function knows the exact name of the image.

switch ($src_dims[2]) {
case 1:
break;
case 2:
imageJpeg($dst_img, $dest, 75); //75 denotes image quality / compression ratio
break;

case 3:
break;
}
//remove the hard copy of the original upload
unlink($src_img);

return $tempName;


And that is it: you now have a function capable of resizing uploaded images. If you are a basic user this will suit your needs perfectly well. However, if you will be doing a vast number of resizes then this code may be slightly slow for your needs, and it may well be worth looking into other resize methods that trade some accuracy for speed.

On a final note with regard to performance, it is generally much better to resize on upload rather than dynamically when a page is generated. It is a trade-off between space and processing time, but if your site is popular then dynamically creating each image on the fly can cost a lot in processing time and make your site slow.

Monday, 4 December 2006

Klingon Programmer Humor

I found this article too amusing to pass up. It's a list of the Top 20 things likely to be overheard if you had a Klingon programmer. Here are a few of my favourites:

  • What is this talk of 'release'? Klingons do not make software 'releases'. Our software
    'escapes' leaving a bloody trail of designers and quality assurance people in its
    wake.
  • Klingon function calls do not have 'parameters' - they have 'arguments' -- and they ALWAYS WIN THEM.
  • Debugging? Klingons do not debug. Our software does not coddle the weak. Bugs are
    good for building character in the user.
Quality stuff.
Hope you enjoy

Saturday, 2 December 2006

Who is Coding Grasshopper?

Well, seeing as m0nkeymafia has introduced himself, I thought I'd better do the same.

I work for the same company as m0nkeymafia doing a whole range of different development type work. I mostly program in C++ on Windows of varying types, as well as currently some embedded stuff on ARM processors.

As well as the usual Windows NT based stuff I also do development for CE. My main job on CE has been getting a platform up and running, which basically involves many painful hours battling with Platform Builder and different quirks of hardware and software.

I've also done development on Linux. I spent a reasonable amount of time getting a platform up and running on a Blackfin processor based board.

There's plenty of other stuff I'm into as well, which hopefully will come across as I write articles for this blog.

Congratulations m0nkeymafia

I'd just like to put out some congratulations to m0nkeymafia for getting one of his "pet projects" featured in Official Xbox Magazine as, and I quote, "The ultimate PES site". PES, for all those who don't know, is Pro Evo Soccer.

So well done m0nkeymafia. Keep up the good work, and I can't wait to see what arrives next from the m0nkeymafia house of web.