Setting up Nginx and CakePHP 2.0

Nginx is a pretty awesome web server (fast, and easy to configure… at least I prefer the syntax over some other popular web servers).

I figured I’d share the installation process of both CakePHP 2.0 and Nginx on Ubuntu 11.04 (Natty).

Let’s fire up the terminal…

(I presume you have some basic knowledge of *nix so I won’t go into details about the commands, etc.)

sudo apt-get install git

Next, let’s get a fresh version of cake 2:

cd /cake
sudo git clone https://github.com/cakephp/cakephp.git
cd cakephp
sudo git checkout 2.0
sudo git pull

Git should tell us that we are “up-to-date”.

Alright, now we have to set up our environment.
The best way I found of going about it (after trying quite a few methods out there) is by running the excellent set of scripts, which you can find here.
Pull the scripts from the git repo, in a similar way as shown above, into some reasonable local destination. Once you have the files locally, simply run:

sudo ./install.sh

After answering a few questions… our LEMP environment is ready!

Let’s see if Nginx is working as expected…
In the browser, head over to localhost. At least at the time of this writing, you get a phpinfo() page served up by default.

So we are satisfied that Nginx is serving up PHP and now it’s time to set up a CakePHP 2.0 app.
When we pulled CakePHP in the very beginning, it came with a skeleton app, which we will use as the starting point for our new one.

Let’s copy it someplace easily accessible (assuming we are in the root of cake… you should see “app”, “vendors”, “lib” directories):

sudo cp -r app /web/

In order not to move cake anywhere, we’ll create a symbolic link to our lib.
(Presuming we are now in the “/web” directory):

sudo ln -s /cake/cakephp/lib lib

So at this point we have PHP, the web server, the cake core, a skeleton app and MySQL all ready to go.
The only remaining part is to tell the web server about our cake app. Similarly to Apache, we can set up virtual hosts and rewrite rules in Nginx. I don’t know many details about setting up Nginx and all the rewrite rule tricks available, but as mentioned before, from the examples the syntax looks quite simple and one should be able to decipher the directives relatively easily.

… Being lazy and not wanting to go through the docs, I googled around and thanks to this post it was quite fast to set up a virtual host for the app.

After the installation Nginx will have a setting file in:
/etc/nginx/sites-available/default
(This particular set of installation steps is applicable to Ubuntu, but hopefully you’ll know how to achieve the same procedure in your own OS).

If we review the file quickly it seems like a solid starting point, but we need to have some rewrite rules for cake to make pretty-urls work.
Well, once again thanks to aforementioned post, all we have to do is add the snippet below to our default config (hey, at least it’s working for me):

# rewrite rules for cakephp
  location / {
    root   /web/app/webroot;
    index  index.php index.html;
    # Fall back to the front controller; the last argument
    # must be a URI (note the leading slash)
    try_files $uri $uri/ /index.php?$args;

    # If the file exists as a static file serve it
    # directly without running all
    # the other rewrite tests on it
    if (-f $request_filename) {
      break;
    }
    if (!-f $request_filename) {
      rewrite ^/(.+)$ /index.php?url=$1 last;
      break;
    }
  }

The root setting points to our app’s webroot, of course… which in turn becomes the root of the virtual host (let’s just use localhost for now; otherwise you’d need to take a few additional steps, but that’s beyond the scope of this post).
Hopefully the code of the setting is not too hard to figure out.
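
To make the rewrite rule a little more concrete, here is a minimal PHP sketch of what the two "if" blocks do to an incoming URI. The function name rewriteForCake() and the $isStaticFile flag are made up for illustration; Nginx does all of this internally and this is not how it is implemented.

```php
<?php
// Hypothetical helper: mimics the effect of the two "if" blocks.
// $isStaticFile stands in for Nginx's "-f $request_filename" test.
function rewriteForCake($uri, $isStaticFile) {
    if ($isStaticFile) {
        // Existing files (css, js, images) are served untouched
        return $uri;
    }
    // Same pattern as the rewrite directive: everything after the
    // leading slash ends up in the "url" query parameter
    return preg_replace('#^/(.+)$#', '/index.php?url=$1', $uri);
}

echo rewriteForCake('/posts/view/5', false), "\n";
echo rewriteForCake('/css/cake.generic.css', true), "\n";
```

A bare "/" does not match the pattern, so it is left alone and the index directive takes care of it.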

So at this point we have a virtual host pointing to our app and all the settings in place. Let’s restart the web server:

sudo /etc/init.d/nginx restart

… and if all goes well, once you visit localhost in your browser, you should see the CakePHP 2.0 welcome page.

Quick comparison of Nginx and Apache

This was a quick test as I was playing around with Nginx and CakePHP 2.0.

The numbers were interesting, however.
What I did:
- Set up a VirtualBox VM on a Windows host
- OS: Ubuntu (Natty)
- PHP 5.3.8
- CakePHP 2.0-beta (freshly pulled)
- apache2 (2.2.17)
- nginx (1.0.5)

Nothing was tweaked or tuned. I’ve set up both servers to use virtual hosts and simply load the default CakePHP page (i.e. fresh install).
There is no app behind any of this, but we are touching pieces of the framework and some PHP logic.
(Comparison is about the web servers anyway)…

Anyway, start apache and run:

ab -kc 10 -t 30 http://localhost/

So we’ll use ApacheBench (ab) to beat on localhost a little (for 30 seconds) and get some numbers:

Benchmarking localhost (be patient)
Finished 839 requests

Server Software:        Apache/2.2.17
Server Hostname:        localhost
Server Port:            80

Document Path:          /
Document Length:        4481 bytes

Concurrency Level:      10
Time taken for tests:   30.009 seconds
Complete requests:      839
Failed requests:        0
Write errors:           0
Keep-Alive requests:    839
Total transferred:      4109432 bytes
HTML transferred:       3759559 bytes
Requests per second:    27.96 [#/sec] (mean)
Time per request:       357.679 [ms] (mean)
Time per request:       35.768 [ms] (mean, across all concurrent requests)
Transfer rate:          133.73 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.7      0       7
Processing:   203  355 121.7    352    2931
Waiting:      202  355 121.7    352    2931
Total:        203  355 122.0    352    2938

Percentage of the requests served within a certain time (ms)
  50%    352
  66%    360
  75%    365
  80%    368
  90%    377
  95%    385
  98%    393
  99%    404
 100%   2938 (longest request)

Now, shut down apache, start nginx and repeat the above test:

Benchmarking localhost (be patient)
Finished 4451 requests

Server Software:        nginx/1.0.5
Server Hostname:        localhost
Server Port:            80

Document Path:          /
Document Length:        4481 bytes

Concurrency Level:      10
Time taken for tests:   30.001 seconds
Complete requests:      4451
Failed requests:        0
Write errors:           0
Keep-Alive requests:    0
Total transferred:      21367368 bytes
HTML transferred:       19972014 bytes
Requests per second:    148.36 [#/sec] (mean)
Time per request:       67.403 [ms] (mean)
Time per request:       6.740 [ms] (mean, across all concurrent requests)
Transfer rate:          695.53 [Kbytes/sec] received

Connection Times (ms)
              min  mean[+/-sd] median   max
Connect:        0    0   0.3      0       7
Processing:    18   67  11.6     67     204
Waiting:        5   40  22.1     42     198
Total:         18   67  11.6     67     204

Percentage of the requests served within a certain time (ms)
  50%     67
  66%     71
  75%     73
  80%     75
  90%     80
  95%     85
  98%     93
  99%    102
 100%    204 (longest request)

OK, let’s see:
Total requests served: Apache – 839, Nginx – 4451
Requests per second: Apache – 27.96, Nginx – 148.36

The other numbers are quite unbelievable as well.
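
To put a single number on the difference, here is a tiny PHP sketch that turns the figures from the two ab runs into ratios. The variable names are mine; the values are copied straight from the output of the two runs.

```php
<?php
// Figures copied from the two ab runs
$apache = array('requests' => 839,  'rps' => 27.96,  'msPerRequest' => 357.679);
$nginx  = array('requests' => 4451, 'rps' => 148.36, 'msPerRequest' => 67.403);

// How many times more requests per second Nginx served in this test
$throughputRatio = $nginx['rps'] / $apache['rps'];
// How many times longer the mean request took under Apache
$latencyRatio = $apache['msPerRequest'] / $nginx['msPerRequest'];

printf("Throughput: %.1fx, mean latency: %.1fx\n", $throughputRatio, $latencyRatio);
```

Both ratios come out at roughly 5.3x. One caveat visible in the output itself: the Apache run reports 839 keep-alive requests while the Nginx run reports 0, so the two runs were not made under identical conditions.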

What’s the point of all this? Tutorial on CakePHP 2.0 + Nginx is coming soon here ;)

Under the hood of CakePHP 2.0

Thanks to the excellent Mr. jrbasso for putting together this list for me.
In case one wonders, yes he does know a few things about cake ;)

So, without further ado here’s a couple of things to enjoy in CakePHP 2.0…

__() now works like sprintf()

There were a few complaints about this in the past, as well as the fact that __() used to echo by default.

Now the problem is fixed and the default usage is as follows:

echo __("some %s var", $myVar);
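
If you want to play with the idea outside of cake, here is a rough stand-in. translateSketch() is a made-up function, not the framework's __(); it skips the translation lookup entirely and only shows the "return the string, substitute sprintf-style" behaviour.

```php
<?php
// Hypothetical stand-in for __(): no translation lookup, just the
// sprintf-style substitution and "return instead of echo" behaviour.
function translateSketch($message) {
    // Everything after the first argument is a substitution value
    $args = array_slice(func_get_args(), 1);
    if (count($args) === 0) {
        // Nothing to substitute: hand the message back untouched
        return $message;
    }
    // Allow the values to be passed as a single array as well
    if (count($args) === 1 && is_array($args[0])) {
        $args = $args[0];
    }
    return vsprintf($message, $args);
}

$myVar = 'tasty';
echo translateSketch('some %s var', $myVar), "\n";
```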

Improved file structure and lazy loading of files

The naming convention is much better and simpler now. No more messing about with underscores, etc.
Whatever you have for your class name is what you have for your file name (+ “.php”).
Misspelling a model file name was a common reason for cake not being able to find the model, ESPeciaALLY (e_s_pecia_a_l_l_y.php) in extreme cases.
Therefore none of the model specific rules for validation, methods and other logic could be executed.
(The simple debugging and good ol’ copy/paste should be much easier now).
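
Here is a small sketch of the difference between the two conventions. Both function names are made up, and the underscore pattern is my assumption of how such an inflection typically works; it is only meant to show why the old names were easy to get wrong.

```php
<?php
// Old convention: CamelCase class name -> underscored file name.
// The pattern is an assumption, not copied from the framework.
function oldStyleFileName($className) {
    // Put an underscore before every capital that follows a word character
    $underscored = preg_replace('/(?<=\w)([A-Z])/', '_$1', $className);
    return strtolower($underscored) . '.php';
}

// New convention (2.0): the file is simply named after the class.
function newStyleFileName($className) {
    return $className . '.php';
}

echo oldStyleFileName('PostRating'), ' vs ', newStyleFileName('PostRating'), "\n";
```

With the new convention there is nothing to compute, which is exactly the point.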

New CakeRequest and CakeResponse

First, CakeRequest gathers all information about the request (in a way like $this->params does, but with a lot more juice).
For example, some responsibilities of the RequestHandler component have been shifted to CakeRequest.
Mark Story has an excellent write-up about this if you wish to learn about the details.

The counterpart, of sorts, to CakeRequest is CakeResponse… which, as you’ve guessed, works to handle responding to requests.
It consolidates the work, which was previously spread across various components of the system.
Again, I will refer you to Mark Story’s blog to get detailed description about CakeResponse.

The whole idea is to decouple and better organize related tasks from various places in the framework. Good organization helps with maintainability of the framework code, and, in turn, your own.

CakeEmail is a Library now

This change should stop 99% of the questions about how to send email from the model.
Yaayy! You don’t have to break MVC anymore (in your User model):

App::uses('CakeEmail', 'Network/Email');
class User extends AppModel {
  public function afterSave($created) {
    if ($created) {
      $email = new CakeEmail();
      $email->from('me@example.com')->to('new.user@example.com')->subject('Welcome')->send('Hello! This is my message to the new user.');
    }
  }
}

You don’t have to chain the methods as above…

HTML5 methods in Form/Html helper

Even though HTML5 is still in its infancy, more and more web applications are beginning to use its elements. This is especially true in the mobile market.
CakePHP 2.0 has a clever way of implementing HTML5 inputs: it uses the magic __call() method to create simple inputs.

Let’s look at the test case to better understand this:

/**
 * test that some html5 inputs + FormHelper::__call() work
 *
 * @return void
 */
	function testHtml5Inputs() {
		$result = $this->Form->email('User.email');
		$expected = array(
			'input' => array('type' => 'email', 'name' => 'data[User][email]', 'id' => 'UserEmail')
		);
		$this->assertTags($result, $expected);

		$result = $this->Form->search('User.query');
		$expected = array(
			'input' => array('type' => 'search', 'name' => 'data[User][query]', 'id' => 'UserQuery')
		);
		$this->assertTags($result, $expected);

		$result = $this->Form->search('User.query', array('value' => 'test'));
		$expected = array(
			'input' => array('type' => 'search', 'name' => 'data[User][query]', 'id' => 'UserQuery', 'value' => 'test')
		);
		$this->assertTags($result, $expected);
	}
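
To see how a magic __call() can produce these inputs, here is a stripped-down sketch. SketchFormHelper is a made-up class and certainly not the real FormHelper; it only handles "Model.field" names and builds the same three attributes the test case expects.

```php
<?php
// Hypothetical mini helper: any unknown method name becomes the input type.
class SketchFormHelper {
    public function __call($method, $params) {
        if (empty($params)) {
            throw new InvalidArgumentException('Missing field name for ' . $method);
        }
        // "User.email" -> model "User", field "email"
        list($model, $field) = explode('.', $params[0]);
        $attributes = array(
            'type' => $method,
            'name' => 'data[' . $model . '][' . $field . ']',
            'id'   => $model . ucfirst($field),
        );
        // Optional second argument: extra attributes such as "value"
        $extra = isset($params[1]) ? $params[1] : array();
        $html = '<input';
        foreach (array_merge($attributes, $extra) as $name => $value) {
            $html .= ' ' . $name . '="' . htmlspecialchars($value, ENT_QUOTES, 'UTF-8') . '"';
        }
        return $html . '/>';
    }
}

$form = new SketchFormHelper();
echo $form->email('User.email'), "\n";
```

Since there is no email() or search() method on the class, PHP routes the call to __call() and the method name ends up as the input type.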

PDO and database access performance

In 2.0 CakePHP is switching to PDO. PDO drivers are stable, native and well-supported, and they should provide faster and better data access.

Auth is more flexible and supports different authentication methods like “Digest”

Authentication and Authorization are now properly decoupled from one another.
In both cases an app developer has the ability to extend BaseAuthenticate or BaseAuthorize to add new authentication and authorization methods.
CakePHP 2.0 comes with a few core methods (at least at the time of writing this):

  • ActionsAuthorize — Provides the ability to authorize using the AclComponent
  • BasicAuthenticate — Provides Basic HTTP authentication support for AuthComponent
  • ControllerAuthorize — Provides the ability to authorize using a controller callback
  • CrudAuthorize — best explanation is taken from the code doc:

    For example, taking `/posts/index` as the current request. The default mapping for `index`, is a `read` permission check. The Acl check would then be for the `posts` controller with the `read` permission. This allows you to create permission systems that focus more on what is being done to resources, rather than the specific actions being visited.

  • DigestAuthenticate — Provides Digest HTTP authentication support
  • FormAuthenticate — Authenticates the identity contained in a request (i.e. the good ol’ login form)

So much cleaner :)

No more PHP4

Nuff’ said. This makes Cake 2.0 much faster (oh, let’s say twice as fast… and no, I don’t have evidence to support this claim) by removing a lot of unnecessary code that was needed to support both PHP 4 and 5, not to mention the additional logic that would go down one path or another depending on which version of PHP you have installed.
To be more specific, CakePHP 2.0 will support PHP 5.2+ (so it may not be utilizing some of the newer features of PHP 5.3).

Using collections to load Helpers, Behaviors, Components

From CakePHP lighthouse page:

Helpers, behaviors, components, and tasks were restructured for 2.0. After examining the various things these objects did, there were some striking similarities. All the object types except Tasks provided callbacks and custom methods. However, the loading and usage of callbacks was slightly different in each case. For 2.0 these different loading/callback triggering API’s were simplified and made uniform. Using BehaviorCollection as the base of how things should work. Each object type now has a Collection object. This collection object is responsible for loading, unloading and triggering callbacks.

After examining the responsibilities of each class involved in the View layer, it became clear that View was handling much more than a single task. The responsibility of creating helpers, is not central to what View does, and was moved into HelperCollection. HelperCollection is responsible for loading and constructing helpers, as well as triggering callbacks on helpers. By default View creates a HelperCollection in its constructor, and uses it for subsequent operations. The HelperCollection for a view can be found at $this->Helpers.

Components were refactored in 2.0 to solve a number of inconsistencies and provide a more uniform API. In the past Component was the loader and manager of Components for a Controller. In 2.0 ComponentCollection takes over that responsibility and Component is now a base class for components. This unifies the API between Helpers and Components as a collection.
Inside a controller $this->Component has been renamed to $this->Components; this makes it more uniform with Behaviors and Helpers.

Custom class names (aliasing) for your Helpers, Behaviors, Components

public $helpers = array(
    "Html" => array(
        "className" => "CustomHtml"
    )
);

Plugins are not auto-loaded

You will need to load them in your class or bootstrap.

Models are now lazy loaded

With all this talk about Lazy Loading, this is probably the one that deserves a lot of attention.
Besides, models will not attempt a DB connection until find() (or other DB-actionable) method is called.
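
The idea of "no connection until you actually ask for data" can be sketched in a few lines of plain PHP. LazyModelSketch is a made-up class with a fake connection; it has nothing to do with cake's internals and only illustrates the pattern.

```php
<?php
// Hypothetical model-like class: it only "connects" when data is needed.
class LazyModelSketch {
    public $connectCount = 0;   // how many times we "connected"
    private $connection = null;

    private function getConnection() {
        if ($this->connection === null) {
            // A real implementation would open the DB connection here
            $this->connection = new stdClass();
            $this->connectCount++;
        }
        return $this->connection;
    }

    public function find($type) {
        $this->getConnection();
        return array(); // no real data in this sketch
    }
}

$post = new LazyModelSketch();   // nothing is connected yet
$post->find('all');              // first find() triggers the connection
$post->find('first');            // the connection is reused
```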

p.s. CakePHP 2.0 could be here sooner than people expect. I have a feeling we’ll have RC4 come CakeFest. So, if you have not started thinking about migrating your app to 2.0, now is a good time ;)

Setup debugging for Netbeans + CakePHP

Update (7/22/2011): dogmatic69 pointed out that you can do the same with Chrome by installing the xdebug extension. See his comment for details.

For all the Netbeans users out there, if you don’t have debugging enabled, this little “how-to” should get you started pretty easily.

First prerequisite is to make sure you have xdebug installed and enabled for PHP.

In the php.ini you should have the following settings:

xdebug.remote_enable=1
xdebug.remote_mode="req"
xdebug.remote_handler="dbgp"
xdebug.remote_host="localhost" ; or try 0.0.0.0
xdebug.remote_port=9000

Make sure that the xdebug extension is enabled, of course.
(Under the [XDebug] section of php.ini, zend_extension=[path to your php_xdebug lib]).
Don’t forget to restart your web server ;)
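
If you are not sure whether PHP actually picked up the settings, a quick script like the one below can tell you. checkXdebugSettings() is a made-up helper; it only relies on extension_loaded() and ini_get(), and the expected values are the ones from the php.ini snippet in this post.

```php
<?php
// Hypothetical helper: reports whether xdebug is loaded and which of the
// expected ini settings are missing or different.
function checkXdebugSettings(array $expected) {
    $report = array(
        'loaded' => extension_loaded('xdebug'),
        'mismatches' => array(),
    );
    foreach ($expected as $name => $want) {
        // ini_get() returns false when the setting does not exist
        $actual = ini_get($name);
        if ($actual === false || (string) $actual !== (string) $want) {
            $report['mismatches'][$name] = $actual;
        }
    }
    return $report;
}

print_r(checkXdebugSettings(array(
    'xdebug.remote_enable' => '1',
    'xdebug.remote_port' => '9000',
)));
```

Run it through the web server rather than the command line, since the two may well be reading different php.ini files.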

Next we’ll install the Netbeans Firefox add-on.
It can be downloaded from here: https://addons.mozilla.org/en-US/firefox/addon/easy-xdebug/
As you can guess, this means that FF should be your default browser, because it will need to open up once you start running the debugger.

Let’s switch to Netbeans, right click on the app of interest and navigate to “Properties”.
In the project properties window select “Run configuration”.
Project URL: http://yourhost.example.local/ (the local host name from which your app is running).
Index File: index.php (easy enough).

Double check some options in Netbeans…
Tools -> Options -> PHP tab
“PHP Interpreter” should point to the correct location of your PHP executable.
“Debugger Port” should be the same as the setting in your php.ini (xdebug.remote_port=9000)
You might want to “uncheck” the “Stop at First Line” box.

This should be it, as far as the setup goes…

You can now run a quick test:
Open app/webroot/index.php.
Add a breakpoint at some line.
(You might want to add xdebug_break(); at the bottom of the file, just to be sure).
Hit Ctrl+F5 (or whatever is the command on your OS to debug the project).

If all goes well, FF should open up and you should see the code execution stop at the line where you’ve set the breakpoint, with some debug info in the Netbeans console.
Congrats!

p.s. Setting up other IDE’s should be similar in the overall approach… but, as always, your mileage may vary.

Offload read queries to a replica DB for better performance

In most web applications that require a lot of find()'s, especially if more than a couple of models are involved, you should probably consider offloading those operations to a read-only replica of your DB. This is typically achieved by having a master/slave or master/master configuration. In a high-traffic application you might have a cluster of databases, but for the purpose of this example we’ll only use two data sources: “default” and “replica”.

Therefore our basic database.php will look something like this:

public $default = array(
		'driver' => 'mysql',
		'persistent' => false,
		'host' => 'production.example.com',
		'login' => 'user',
		'password' => 'password',
		'database' => 'production',
		'prefix' => '',
		'encoding' => 'utf8'
	);

	public $replica = array(
		'driver' => 'mysql',
		'persistent' => false,
		'host' => 'readonly.example.com',
		'login' => 'user',
		'password' => 'password',
		'database' => 'readonly',
		'prefix' => '',
		'encoding' => 'utf8'
	);

This prepares our application to use two data sources, as needed.

Next, let’s imagine we have a good ol’ blog and need to grab various information to build a list of posts.
In our Posts Controller, we’ll have some method that gets the required information:

$this->Post->getListofPosts();

The method above will have to involve additional models to get all of the needed info (Author, Tag, (PostsTag for the join table), PostRating… and maybe a few other models). The point is that this is enough operations already to consider offloading them to our read-only DB server.

The actual process is quite simple.
First, we’ll create a generic method in our App Model:

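	/**
	 * Point each of the given models at the named data source.
	 * Call again without $datasource to switch back to 'default'.
	 */
	protected function _switchDataSource($models, $datasource = 'default') {
		// Accept a single model name as well as an array of names
		foreach ((array)$models as $model) {
			ClassRegistry::init($model)->setDataSource($datasource);
		}
	}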

I hope this code is simple enough, but the implementation example is coming up…
It’s worth noting that ClassRegistry::init() will cache your model information (object instance, to be more precise) in memory, therefore in a more complex case (where you might have multiple find()'s) the newly switched data source will persist until switched back. Therefore it is important to remember to “reset” your data source once you are done with the read operation(s).

Now, here’s the basic usage sample (this snippet would be inside of our getListofPosts() method):

//let's switch our DS to replica
$this->_switchDataSource(array(
  'Post', 'PostsTag', 'Author', 'PostRating'
), 'replica');

//now we can execute our find with all of the above
//model data coming from the read-only DB
$posts = $this->find('all', array(
  'contain' => array(
  //include our models here
  ),
  'limit' => 35
));

//don't forget to switch the DS back to default
$this->_switchDataSource(array(
  'Post', 'PostsTag', 'Author', 'PostRating'
));

return $posts;

As you see, the implementation is pretty simple. The only thing to keep in mind is that if you are getting some SQL errors, chances are you have not included all of the required models for the operation; the most common case is forgetting the join table model. To troubleshoot and see the results, the debug kit is very helpful, because it will show you which data source is being used to run a particular set of queries.
Another hint: if you use the same set of models over and over, you might as well assign them to a property in your model, so that if you need to add or change something you’d only do it in one place.
Using our example we can modify the code like so:

public $postQueryModels = array('Post', 'PostsTag', 'Author', 'PostRating');

....

$this->_switchDataSource($this->postQueryModels, 'replica');
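
Since forgetting to switch back is the easiest mistake to make here, one option is to wrap the "switch, read, switch back" dance in a single helper. The sketch below is plain PHP with a made-up StubModel standing in for real models, and withDataSource() is a hypothetical name; note that "finally" needs PHP 5.5 or newer.

```php
<?php
// Stand-in for a model; it only tracks which data source it is using.
class StubModel {
    public $useDbConfig = 'default';
    public function setDataSource($name) {
        $this->useDbConfig = $name;
    }
}

// Hypothetical helper: run $callback with the models pointed at
// $dataSource, then always switch them back to 'default'.
function withDataSource(array $models, $dataSource, $callback) {
    foreach ($models as $model) {
        $model->setDataSource($dataSource);
    }
    try {
        return call_user_func($callback);
    } finally {
        // Runs even if the callback throws, so the reset is never skipped
        foreach ($models as $model) {
            $model->setDataSource('default');
        }
    }
}

$post = new StubModel();
$usedDuringRead = withDataSource(array($post), 'replica', function () use ($post) {
    return $post->useDbConfig; // a real callback would run the find() here
});
```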

If you use UUID’s…

Be extra careful to make sure that, according to convention, your ‘id’ column (or primary key) is:

char(36) and never varchar(36)

CakePHP will work with both definitions, however you will be sacrificing about 50% of the performance of your DB (MySQL in particular). This will be most evident in more complicated SELECT’s, which might require some JOIN’s or calculations.

find(‘first’)… gotcha

Just a simple tip…

Here’s a typical find() example:

$this->Post->find('first', array('condition' => array('id' => 5)));

Why does it return me the post with ID = 1, rather than ID = 5?

Of course, a careful reader spotted the spelling issue in the key:
‘condition’ instead of ‘conditions’.

Moral of the story is that find('first') can be a little tricky and misleading and if you are getting an unexpected result be sure to double-check your implementation.

find('first', 'blah');
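
One cheap way to catch this kind of typo during development is to compare the keys you pass against the keys you expect. unknownFindKeys() below is a made-up helper, and the list of known keys is my assumption of the usual find() options; adjust it to whatever your version and behaviors accept.

```php
<?php
// Hypothetical helper: returns the option keys that find() would
// most likely ignore without complaint.
function unknownFindKeys(array $query) {
    // Assumed list of commonly used find() options
    $known = array(
        'conditions', 'fields', 'joins', 'limit', 'offset', 'order',
        'page', 'group', 'callbacks', 'recursive', 'contain',
    );
    return array_values(array_diff(array_keys($query), $known));
}

print_r(unknownFindKeys(array('condition' => array('id' => 5))));
```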

Installing membase/memcached

These are instructions for getting membase/memcached installed in your local environment.
(I was installing on windows so your setup might be a little different, but general approach is mostly the same).

1. Download the membase server for your OS (community edition is the one you are after):
http://www.couchbase.com/downloads

Once downloaded, the installation is quite simple. Just follow the prompts.
After the server installation is complete, your default browser should open and you will be prompted to continue setting up the “cache bucket”.

Again, it is a simple step by step process. The only thing you’d want to make sure is that you select memcached bucket type.
(This will be 100% compatible with any other default installation).

2. Make sure you have memcache support enabled in your PHP install.
The easiest way to find out is to check the output of phpinfo(). Look for the “memcache” section.

3. What to do if you don’t see the section mentioned above?
First, check your php.ini for: extension=php_memcache.dll (php_memcache.so, on *nix platforms).
In some cases you just might need to uncomment the line above.

Next, you should have an entry for the php/memcached settings, it should look something like this:


[Memcache]
memcache.allow_failover = 1
memcache.max_failover_attempts = 20
memcache.chunk_size = 8192
memcache.default_port = 11211

In my case the php_memcache.dll was missing, so I downloaded the appropriate version from:
http://code.google.com/p/thinkam/downloads/detail?name=php_memcache-cvs-20090703-5.3-VC6-x86.zip&can=2&q=

You’ll have to place the .dll or .so with the rest of your php extensions; if you are using XAMPP it will be somewhere like:
c:\xampp\php\ext

4. Configure cake
Open up app/config/core.php and scroll all the way down to the cache settings.
Presuming you’ve installed everything with all defaults, all you’d have to change is:
'engine' => 'File' to 'engine' => 'Memcache'

5. Restart your web server.
You might want to clear your model cache first, to be sure that cake doesn’t fall back on File caching if something isn’t configured properly.
If all goes well, you should see no warnings, nothing should be written to the file system, and in your membase console (Monitor -> Data Buckets -> default) you should begin to see some activity.
(The chart should be updated, and you should see some records under TOP KEYS).

Congrats, you are now using memcached as your caching mechanism.
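
If something does not work, a quick way to double-check steps 2 and 3 without scrolling through phpinfo() is a tiny script like this. memcacheSupport() is a made-up helper built on extension_loaded() and ini_get(); it does not try to talk to the server.

```php
<?php
// Hypothetical helper: is the memcache extension loaded, and which
// default port does PHP think it should use?
function memcacheSupport() {
    return array(
        'memcache' => extension_loaded('memcache'),
        // false when the setting is unknown, i.e. the extension is missing
        'default_port' => ini_get('memcache.default_port'),
    );
}

print_r(memcacheSupport());
```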

Good luck!

Speed up your pagination with a simple hack…

Before I go into the example in this little post, let me just say that this situation won’t be applicable to everyone…

But let’s consider the following:
We have a table with tens of thousands of records, that need to be paginated.

As you know, cake will execute two queries; first to get the count of total records, second to get the actual records.

The questions one might ask are:
“Would any user really go through 5 thousand pages to find what they are looking for?”
“If I bring the last 1,000 records wouldn’t that be enough for a vast majority of needs?”
“How many pages in Google do you go through, when searching for something, before you give up?”
“Is it not better to provide filters, or search tools to help your users narrow down the results to something manageable?”

If you’ve answered “Yes” to two or more questions, please consider the hack…

In your model, which needs to be paginated, do the following:

public function paginateCount($conditions = null,
                                 $recursive = 0,
                                 $extra = array()) {
   return 1000;
}

Yep, we are overriding paginateCount() and simply returning 1,000 because we know that this will be the maximum amount of records that our paginator needs to know about.
Depending on how complex the underlying query is (for example you might have JOIN’s or various conditions, which would usually need to be taken into the account in your typical count query), the above hack can dramatically increase the performance of your pagination.
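
To see what the paginator ends up believing, here is the arithmetic as a tiny sketch. pageCount() is a made-up helper and the page size of 20 is just an example value.

```php
<?php
// Number of pages the paginator will offer for a given (possibly faked) count
function pageCount($count, $limit) {
    return (int) ceil($count / $limit);
}

echo pageCount(1000, 20), "\n"; // pages offered with the hard-coded count
echo pageCount(200, 20), "\n";  // pages that really exist if the table holds 200 rows
```

The second line is the catch: if the table (or a filtered result) has fewer than 1,000 records, the hard-coded count will advertise pages that turn out to be empty, so the hack fits best where you know there are always plenty of rows.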

Dealing with static pages v2 (or… 3?)

Over the years of cake development we’ve seen a number of ways to get rid of the /pages/ path in the URL for static pages.

By default if you create an “about us” page, such as in app/views/pages/about.ctp, the resulting URL would be:
www.example.com/pages/about

I’m sure you’ve seen a ton of complaints and solutions about how to get rid of this seemingly “annoying” /pages/
To keep the URL’s clean, most people would obviously prefer to have www.example.com/about instead. (No /pages/ in sight).

Recently, my favorite solution has become the following setting in the routes.php:

$staticPages = array(
		'about',
		'legal',
		'policy',
		'something'
);

$staticList = implode('|', $staticPages);

Router::connect('/:static', array(
		'plugin' => false,
		'controller' => 'pages',
		'action' => 'display'
	), array(
		'static' => $staticList,
		'pass' => array('static')
	)
);

Go ahead, create your about.ctp and then attempt to access it by going to www.yoursite.com/about

The Router will nicely remove the /pages/ from the URL.

Q.
Why keep the pages in the array() and then use implode()? Couldn’t I just have the following?

$staticList = 'about|legal|policy|something';

A.
You sure could, but the array structure keeps things neater, and if you have a large number of pages it’s easier to scan visually and insert/remove pages as necessary. Either way, you have both options to work with.
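
For the curious, the imploded list is used as a regular expression for the :static part of the route. Here is a rough plain PHP sketch of that matching. matchStaticPage() is a made-up helper and not how the Router is implemented; it just shows which URLs the pattern lets through.

```php
<?php
// Hypothetical helper: returns the page name when $path is one of the
// static pages, or null when it should fall through to other routes.
function matchStaticPage($path, array $staticPages) {
    $quoted = array();
    foreach ($staticPages as $page) {
        // Escape anything that would be special inside the pattern
        $quoted[] = preg_quote($page, '#');
    }
    $pattern = '#^/(' . implode('|', $quoted) . ')$#';
    return preg_match($pattern, $path, $matches) ? $matches[1] : null;
}

$staticPages = array('about', 'legal', 'policy', 'something');
var_dump(matchStaticPage('/about', $staticPages));
var_dump(matchStaticPage('/posts', $staticPages));
```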