Caching Midgard requests
-
Piotr Pokora
Caching Midgard requests
Thu August 28 2008 12:48:28 UTC
Hi!
I ran some simple performance tests on what is known as
midgard_request_config in Midgard2.
Basically it's a replacement for $_MIDGARD and for the core's
request_config (which is not propagated to the PHP level).
At the PHP level it's a simple array which holds:
* midgard_host object
* current midgard_page object
* all pages in the tree, starting from the root one down to the current one
* midgard_style object
* argc and argv[]
I know these are fetched on MidCOM level, so request_config gives them
"for free".
Additionally, the pages array is a ready-to-use kind of breadcrumb. Just
iterate over the array and do what you need.
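A minimal sketch of that structure and of the breadcrumb iteration, in Python rather than PHP. The class and field names (Page, RequestConfig, breadcrumb) are made up for illustration and are not the Midgard API; plain strings stand in for the midgard_host and midgard_style objects.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Page:
    name: str    # URL name of the page
    title: str   # human-readable title used in the breadcrumb


@dataclass
class RequestConfig:
    host: str    # stands in for the midgard_host object
    style: str   # stands in for the midgard_style object
    pages: List[Page] = field(default_factory=list)  # root page first, current page last
    argv: List[str] = field(default_factory=list)

    @property
    def page(self) -> Page:
        """The current page is the last one in the tree path."""
        return self.pages[-1]

    @property
    def argc(self) -> int:
        return len(self.argv)


def breadcrumb(config: RequestConfig, separator: str = " > ") -> str:
    """Build a breadcrumb by iterating over the pages array."""
    return separator.join(page.title for page in config.pages)


config = RequestConfig(
    host="www.mysite.com",
    style="default",
    pages=[Page("", "Home"), Page("pageA", "Page A"), Page("pageB", "Page B")],
    argv=["argv1", "argv2"],
)
print(breadcrumb(config))
```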
In Midgard2 we need to follow these steps to create request_config:
1. Parse the URL and tokenize it
2. Fetch the host record with QB
3. Fetch the root page (QB)
4. Try to fetch all pages by their name (QB)
5. When a page is not found, set argc and create the argv array
6. Create the request config object
7. Propagate it as a PHP one
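The steps above can be sketched roughly as follows. This is an illustration only: the PAGES dict is a hypothetical stand-in for the QB queries, the host lookup is omitted, and the sketch assumes that every token after the first unmatched one becomes argv.

```python
from typing import Dict, List, Tuple

# Hypothetical page tree: (parent_id, name) -> page_id. Stands in for QB queries.
PAGES: Dict[Tuple[int, str], int] = {
    (0, ""): 1,        # root page of the host
    (1, "pageA"): 2,
    (2, "pageB"): 3,
}


def tokenize(url: str) -> Tuple[str, List[str]]:
    """Step 1: split 'host/a/b' into the host name and its non-empty path tokens."""
    host, _, path = url.partition("/")
    return host, [token for token in path.split("/") if token]


def resolve(url: str) -> dict:
    host, tokens = tokenize(url)
    pages = [PAGES[(0, "")]]           # steps 2-3: host (omitted here) and root page
    consumed = 0
    for token in tokens:               # step 4: walk down the tree by page name
        child = PAGES.get((pages[-1], token))
        if child is None:
            break                      # step 5: no such page, the rest is argv
        pages.append(child)
        consumed += 1
    argv = tokens[consumed:]
    # Step 6: a plain dict stands in for the request config object.
    return {"host": host, "pages": pages, "argc": len(argv), "argv": argv}


print(resolve("www.mysite.com/pageA/pageB/argv1/argv2"))
```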
So what we need to cache is everything from steps 1 to 4, i.e. all the
SQL queries. The initial idea looks like this:
1. Get url and do lookup to find *the same* in cache
2. If not found perform steps uncached (1-4 as above), and if found, get
request config from cache
3. Clone it
4. Propagate as PHP object
Let me know if you really need to know why we need to clone it.
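A minimal sketch of that lookup-then-clone flow, assuming an exact-match cache keyed by the full URL. The names get_request_config and build_uncached are hypothetical. The message does not say why the clone is needed; in this sketch its only visible effect is that a caller modifying its copy cannot change the cached entry.

```python
import copy
from typing import Callable, Dict

_cache: Dict[str, dict] = {}   # full URL -> request config


def get_request_config(url: str, build_uncached: Callable[[str], dict]) -> dict:
    """Return a private copy of the request config for this exact URL."""
    cached = _cache.get(url)
    if cached is None:
        cached = build_uncached(url)   # the uncached path (steps 1-4 plus argv)
        _cache[url] = cached
    return copy.deepcopy(cached)       # clone before handing it to the caller


def slow_build(url: str) -> dict:
    # Stand-in for the SQL-backed construction of the request config.
    return {"url": url, "argv": ["argv1"]}


first = get_request_config("www.mysite.com/pageA/argv1", slow_build)
first["argv"].append("changed by caller")
second = get_request_config("www.mysite.com/pageA/argv1", slow_build)
print(second["argv"])
```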
# TESTS:
## Memory
I assumed that in most cases (on average) midgard_request_config
holds about 4 objects:
1 host, 1 style, 2 pages.
Within a budget of up to 50 MB of memory, we can permanently hold about
5,000 different URLs. Of course this could be configurable for sites which
have enough memory and need more speed. But at least you know the number.
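A back-of-the-envelope check of those numbers, assuming that MB means 2^20 bytes and that memory use scales linearly with the number of entries (neither is stated in the message). The helper max_entries is hypothetical.

```python
BUDGET_BYTES = 50 * 1024 * 1024   # the 50 MB budget from the message
CACHED_URLS = 5000                # entries said to fit in that budget
OBJECTS_PER_ENTRY = 4             # 1 host, 1 style, 2 pages

# Implied size of one cached request config and of one object inside it.
bytes_per_entry = BUDGET_BYTES // CACHED_URLS
bytes_per_object = bytes_per_entry / OBJECTS_PER_ENTRY


def max_entries(budget_mb: int, entry_bytes: int = bytes_per_entry) -> int:
    """How many entries fit in a given budget, under linear scaling."""
    return (budget_mb * 1024 * 1024) // entry_bytes


print(bytes_per_entry, "bytes per cached URL")
print(round(bytes_per_object), "bytes per object")
print(max_entries(100), "entries in 100 MB")
```

Under these assumptions each cached URL costs about 10 KB, or roughly 2.6 KB per object.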
## Performance
I ran it for 30,000 unique URLs following the pattern:
www.mysite.com/pageA/pageB/pageC/pageD/argv1/argv2/29999.
I looked up the first URL in the cache and the last one, and fetched four
objects using QB.
(Keep in mind that the tests measure in microseconds, not milliseconds.)
www.mysite.com/pageA/pageB/pageC/pageD/argv1/argv2/29999
Time 0.012 milliseconds
www.mysite.com/pageA/pageB/pageC/pageD/argv1/argv2/1
Time 0.865 milliseconds
Get Objects
Time 4.715 milliseconds (0.004715 sec)
With the MySQL cache turned on, fetching the objects takes about 3.000
milliseconds, so the number is still high compared to cache lookups. So
even in the slowest part of the cache we fetch requests roughly 4x faster.
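The ratios behind that comparison can be worked out directly from the reported timings. The sketch below assumes that "3.000" is a decimal value, i.e. 3.0 ms.

```python
# Timings in milliseconds as reported above.
CACHE_BEST_MS = 0.012        # fastest cache lookup
CACHE_WORST_MS = 0.865       # slowest cache lookup
QB_MS = 4.715                # fetching four objects with QB
QB_MYSQL_CACHE_MS = 3.0      # assumption: "3.000" read as 3.0 ms


def speedup(baseline_ms: float, cached_ms: float) -> float:
    """How many times faster the cached path is than the baseline."""
    return baseline_ms / cached_ms


for label, baseline in (("QB", QB_MS), ("QB + MySQL cache", QB_MYSQL_CACHE_MS)):
    print(
        f"{label}: {speedup(baseline, CACHE_WORST_MS):.1f}x vs slowest lookup, "
        f"{speedup(baseline, CACHE_BEST_MS):.0f}x vs fastest lookup"
    )
```

Against the slowest lookup this gives about 5.5x without the MySQL cache and about 3.5x with it, in line with the 4x figure quoted above.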
# Issues
Problem is what to cache exactly. Request or page? In the first case we
need to hold cache entries for the same page many times whenever the page
is used with many different argv. In the latter, we can tokenize the URL
and repeat cache lookups until we find the page's URL (without argv) in
the cache, but this may be slower than the SQL queries if the page's cache
entry is not near the beginning of the cache and the URL carries many argv.
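The page-based variant can be sketched as a longest-prefix lookup: strip one trailing token at a time until the remaining URL is found in the cache. The function name find_page is hypothetical, and the sketch uses a dict (hash lookups), which is an assumption and may not match the ordered structure that was actually measured. The returned lookup count shows how the cost grows with the number of argv.

```python
from typing import Dict, List, Optional, Tuple


def find_page(url: str, page_cache: Dict[str, str]) -> Tuple[Optional[str], List[str], int]:
    """Return (cached page entry, argv, number of cache lookups performed)."""
    tokens = url.split("/")
    lookups = 0
    # Try the full URL first, then drop one trailing token per attempt.
    for end in range(len(tokens), 0, -1):
        lookups += 1
        prefix = "/".join(tokens[:end])
        if prefix in page_cache:
            return page_cache[prefix], tokens[end:], lookups
    return None, [], lookups   # not cached: fall back to the SQL queries


page_cache = {"www.mysite.com/pageA/pageB": "config for pageB"}
entry, argv, lookups = find_page("www.mysite.com/pageA/pageB/argv1/argv2", page_cache)
print(entry, argv, lookups)
```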
What do you think?
Piotras
_______________________________________________
dev mailing list
dev@lists.midgard-project.org
http://lists.midgard-project.org/mailman/listinfo/dev
-
Re: [midgard-dev] Caching Midgard requests
Fri August 29 2008 08:49:35 UTC
Hi,
As this all started from the benchmarks I ran after hearing Rasmus
Lerdorf's talk (http://www.sitepoint.com/blogs/2008/08/29/rasmus-lerdorf-php-frameworks-think-again/),
I guess I should answer :-)
On Thu, Aug 28, 2008 at 3:48 PM, Piotr Pokora <piotrek.pokora@gmail.com> wrote:
> I know these are fetched on MidCOM level, so request_config gives them
> "for free".
Yes. When MidCOM3 is run with Midgard 1.x, it fetches all this from the
database, but the Midgard 2 dispatcher of MidCOM3 gets them from
midgard_request_config.
http://github.com/bergie/midcom/tree/master/midcom_core/services/dispatcher/midgard2.php
> 1. Get url and do lookup to find *the same* in cache
> 2. If not found perform steps uncached (1-4 as above), and if found, get
> request config from cache
> 3. Clone it
> 4. Propagate as PHP object
I imagine this will make big improvements.
Before I can test this on my benchmarks, it would be interesting to
hear differences with "siege -c 5 -t 30s" on cached vs uncached
midgard_request_config.
As my benchmarks showed, Midgard2 + MidCOM3 performs stellarly on 10
second sieges, but the performance drops at 30 sec runs. As I believe
MySQL connection clogging is the main cause here, caching the request
data should make a huge difference.
> Problem is what to cache exactly. Request or page?
What I would do is have two caches:
* URL-to-page mapping cache
* Page-to-midgard_request_config cache
First you match a given URL to its page, then get the page's
midgard_request_config from cache, change dynamic ARGs as needed
(based on difference of URL and page URL), and then pass it on to
PHP...
That way midgard_request_config wouldn't need to be stored for each
URL, but only for each page. It is quite a big difference, as I
believe a typical MidCOM site can easily have thousands of URLs
(articles, their different variants, whatever), but only a few dozen
pages.
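A rough sketch of the two-cache idea, with plain dicts standing in for both caches. All names here are hypothetical, and the sketch assumes the page is identified by its URL path so that argv can be derived from the difference between the request URL and the page URL.

```python
from typing import Dict, Optional

url_to_page: Dict[str, str] = {}       # full request URL -> page URL
page_to_config: Dict[str, dict] = {}   # page URL -> request config without argv


def lookup(url: str) -> Optional[dict]:
    """Return a request config for the URL, or None if either cache misses."""
    page_url = url_to_page.get(url)
    if page_url is None or page_url not in page_to_config:
        return None                    # caller falls back to the uncached path
    config = dict(page_to_config[page_url])   # copy so the cached entry stays untouched
    # The dynamic args are whatever the request URL has beyond the page URL.
    argv = [token for token in url[len(page_url):].split("/") if token]
    config["argv"] = argv
    config["argc"] = len(argv)
    return config


page_to_config["www.mysite.com/news"] = {"host": "www.mysite.com", "page": "news"}
url_to_page["www.mysite.com/news/2008/08"] = "www.mysite.com/news"
url_to_page["www.mysite.com/news/article-1"] = "www.mysite.com/news"

print(lookup("www.mysite.com/news/2008/08"))
print(lookup("www.mysite.com/news/article-1"))
```

In this sketch two URLs share one stored config; only the small URL-to-page entries grow with the number of URLs.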
> Piotras
/Bergie
--
Henri Bergius
Motorcycle Adventures and Free Software
http://bergie.iki.fi/
Skype: henribergius
Jabber: henri.bergius@gmail.com
Jaiku: http://bergie.jaiku.com/
-
Re: [midgard-dev] Caching Midgard requests
Fri August 29 2008 09:22:45 UTC
Henri Bergius writes:
> Hi,
Hi!
> Before I can test this on my benchmarks, it would be interesting to
> hear differences with "siege -c 5 -t 30s" on cached vs uncached
> midgard_request_config.
All my tests were made with a simple command-line program which just
measured time.
The MySQL server had no other external or internal connections, so in real
life the time I posted for getting objects from the DB should be much longer.
I have started simple tests with self-designed structures, as those used in
the earlier tests were implemented with the GLib API.
The task is quite hard, as I decided we need to limit the slow case to 2
milliseconds (2000 microseconds).
Within this time I need to find the URL/page, re-sort the cache and
recreate it. Simply put, we need to have full control
over "queued" cache entries.
It looks like this is possible for up to 20,000 cache entries.
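One way to keep a bounded cache ordered is a least-recently-used policy. The sketch below is an assumption about what "re-sorting" could look like, not the structure described above: it uses Python's OrderedDict and a hypothetical BoundedCache class, with the 20,000 entry limit as the default.

```python
from collections import OrderedDict
from typing import Optional


class BoundedCache:
    """Keeps at most max_entries items and evicts the least recently used one."""

    def __init__(self, max_entries: int = 20000) -> None:
        self.max_entries = max_entries
        self._entries: "OrderedDict[str, dict]" = OrderedDict()

    def get(self, url: str) -> Optional[dict]:
        entry = self._entries.get(url)
        if entry is not None:
            self._entries.move_to_end(url)      # mark as most recently used
        return entry

    def put(self, url: str, config: dict) -> None:
        self._entries[url] = config
        self._entries.move_to_end(url)
        if len(self._entries) > self.max_entries:
            self._entries.popitem(last=False)   # drop the least recently used entry

    def __len__(self) -> int:
        return len(self._entries)


cache = BoundedCache(max_entries=2)
cache.put("/a", {"page": "a"})
cache.put("/b", {"page": "b"})
cache.get("/a")                  # "/a" becomes the most recently used entry
cache.put("/c", {"page": "c"})   # evicts "/b"
print(len(cache), cache.get("/b"))
```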
Another thing is that the implementation should be tied to the midgard-php
extension, not to the core.
At least for now, as making a core API for this would require a special
design for derived hooks.
>> Problem is what to cache exactly. Request or page?
>
> What I would do is have two caches:
>
> * URL-to-page mapping cache
> * Page-to-midgard_request_config cache
That should also be doable. Of course, real cases will show where the
cache limit is.
Piotras
