Open Source Content Management Framework

Mjönir: storing GUIDs/UUIDs in binary

1 2 3 4 next »
  1. Mjönir: storing GUIDs/UUIDs in binary

    Wed June 24 2009 16:45:07 UTC
    As discussed in linköping: GUID/UUIDs should be stored in binary and
    converted to ascii representation on API layer.

    Pros:

    + Significantly less storage used
    + Smaller indexes
    + Faster searches (I'd guess, since it's not a text search)

    Cons:

    - Not backwards compatible (it doesn't need to be...)
    - Debugging on DB level becomes slightly more painful (but this matters
    not to users or non-core developers...)


    And to those exclaiming that what kind of brainfart is this, sorting
    them in text ?!?!: "historical reasons".

    /Rambo
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  2. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Thu June 25 2009 10:05:06 UTC
    Eero af Heurlin writes:

    Hi!

    > As discussed in linköping: GUID/UUIDs should be stored in binary and
    > converted to ascii representation on API layer.
    >
    > Pros:
    >
    > + Significantly less storage used
    > + Smaller indexes
    > + Faster searches (I'd guess, since it's not a text search)

    SQLite doesn't have binary data type.
    There's only BLOB which is considered as is.
    I am not sure if it's text search or not in such case.

    > Cons:
    >
    > - Not backwards compatible (it doesn't need to be...)

    I am afraid it has to be backward compatible in Mjolnir.
    But this couldn't be a problem as we can introduce UUID property with
    GUID in parallel.

    > - Debugging on DB level becomes slightly more painful (but this matters
    > not to users or non-core developers...)

    Debugging is always difficult tusk, so I wouldn't say it's con.
    If needed, we could create some specific tool for debugging.
    But of course, first important thing is if binary will be easily
    portable across db providers.

    Piotras
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  3. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Thu June 25 2009 10:45:05 UTC
    Piotr Pokora wrote:
    >> Pros:
    >>
    >> + Significantly less storage used
    >> + Smaller indexes
    >> + Faster searches (I'd guess, since it's not a text search)
    >
    > SQLite doesn't have binary data type.
    > There's only BLOB which is considered as is.
    > I am not sure if it's text search or not in such case.
    >

    BLOB must be a binary search, but it's entirely another question if
    sqlite allows indexing on BLOBs.

    Anyways the installations that use sqlite should not have so large DBs
    that the index would be absolutely required since the raw binary search
    should be fast anyway.

    This is relatively easy to benchmark though: take a large production db
    dump and copy it somewhere, drop all indexes and make new column
    guid_binary and populate with the binary versions then start comparing.

    >> Cons:
    >>
    >> - Not backwards compatible (it doesn't need to be...)
    >
    > I am afraid it has to be backward compatible in Mjolnir.
    > But this couldn't be a problem as we can introduce UUID property with
    > GUID in parallel.
    >

    You mean the mRFC0042 UUIDs ?

    Anyways incompatibility exists only on db level (as I suggested on API
    level hex presentations would be used) and conversion script is trivial.

    Surely there are other changes between mjölnir and vinland that require
    database updates.

    >> - Debugging on DB level becomes slightly more painful (but this matters
    >> not to users or non-core developers...)
    >
    > Debugging is always difficult tusk, so I wouldn't say it's con.
    >

    It's a trivial matter, but con anyway.


    /Rambo
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  4. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Sun July 19 2009 15:55:06 UTC
    Made some tests on my Nexenta (GNU/OpenSolaris) midgard zone (not really
    a VM but close).

    dumped repligard table with --opt from my personal installation:

    60M Jul 19 21:50 repligard.sql.orig

    Made a database 'test' and imported the .orig file as the 'repligard'
    table, then made the table with binary guids:

    cp repligard.sql.orig repligard_bin.sql
    sed -i.sed -r -e "s/'([0-9a-f]{32,80})'/UNHEX('\1')/g" repligard_bin.sql
    sed -i -e 's/repligard/repligard_bin/' repligard_bin.sql

    and import the repligard_bin.sql to the test db.

    root@midgard:/var/tmp# ls -lah /var/lib/mysql/test/
    total 125M
    drwx------ 2 mysql mysql 9 Jul 19 22:07 .
    drwxr-xr-x 6 mysql mysql 11 Jul 19 22:01 ..
    -rw-rw---- 1 mysql mysql 65 Jul 19 22:01 db.opt
    -rw-rw---- 1 mysql mysql 35M Jul 19 22:03 repligard.MYD
    -rw-rw---- 1 mysql mysql 39M Jul 19 22:03 repligard.MYI
    -rw-rw---- 1 mysql mysql 8.9K Jul 19 22:03 repligard.frm
    -rw-rw---- 1 mysql mysql 28M Jul 19 22:07 repligard_bin.MYD
    -rw-rw---- 1 mysql mysql 25M Jul 19 22:07 repligard_bin.MYI
    -rw-rw---- 1 mysql mysql 8.9K Jul 19 22:07 repligard_bin.frm

    Here we can significantly smaller data and especially index (371414 rows).


    mysql> SELECT id,realm FROM repligard WHERE guid IN
    ('eaaa5ae56694c00d0ec29cef87108881',
    'c1eb0a70cfda11db95e7c514e61ac918c918',
    'f3c8548acfda11db95e7c514e61ac918c918',
    'b20217eccfdb11db95e7c514e61ac918c918',
    '66a37932cfde11db95e7c514e61ac918c918');
    +-----+-------------------------------+
    | id | realm |
    +-----+-------------------------------+
    | 195 | org_routamc_photostream_photo |
    | 107 | org_routamc_photostream_photo |
    | 74 | org_routamc_photostream_photo |
    | 30 | person |
    | 81 | org_routamc_photostream_photo |
    +-----+-------------------------------+
    5 rows in set (0.00 sec)

    mysql>
    mysql> SELECT id,realm FROM repligard_bin WHERE guid IN
    (UNHEX('eaaa5ae56694c00d0ec29cef87108881'),
    UNHEX('c1eb0a70cfda11db95e7c514e61ac918c918'),
    UNHEX('f3c8548acfda11db95e7c514e61ac918c918'),
    UNHEX('b20217eccfdb11db95e7c514e61ac918c918'),
    UNHEX('66a37932cfde11db95e7c514e61ac918c918'));
    +-----+-------------------------------+
    | id | realm |
    +-----+-------------------------------+
    | 195 | org_routamc_photostream_photo |
    | 107 | org_routamc_photostream_photo |
    | 74 | org_routamc_photostream_photo |
    | 30 | person |
    | 81 | org_routamc_photostream_photo |
    +-----+-------------------------------+
    5 rows in set (0.00 sec)

    Seems my table is not large enough and since the guid is the primary key
    I can't delete the index to compare scanning speed.

    However this completely artificial case is interesting:

    mysql> SELECT id,realm FROM repligard WHERE guid >
    '66a37932cfde11db95e7c514e61ac918c918' AND guid <
    'c1eb0a70cfda11db95e7c514e61ac918c918';

    <snip>

    131975 rows in set (0.38 sec)

    mysql> SELECT id,realm FROM repligard_bin WHERE guid >
    UNHEX('66a37932cfde11db95e7c514e61ac918c918') AND guid <
    UNHEX('c1eb0a70cfda11db95e7c514e61ac918c918');

    <snip>

    131973 rows in set (0.46 sec)

    Interesting as in on what grounds is the text lookup faster than pure
    binary, the resultsets are mostly the same but for some reason the text
    comparison returns two more results.


    Another query set:

    mysql> select hex(guid),realm from repligard_bin where id=153;
    +--------------------------------------+---------------------------------+
    | hex(guid) | realm |
    +--------------------------------------+---------------------------------+
    | 0F5AB82FE6C83AECE4C969470B0A50B8 | element |
    | E7B5E0C2E1A395EC735A33DF4F422EFC | snippet |
    | 0846BED617A584BA828CF1742454B0E1 | record_extension |
    | 731CE1D8B4BDE32E73E2C70406837A91 | page |
    | 5AAF11E51C5D9956011240EB4B2AD470 | eventmember |
    | EE163297585E93555B249E3A96CFD9C5 | member |
    | 0713CB44547977C27BB09C82251A6EF5 | language |
    | ACAA16866EC65E86DC38C27D716585C1 | pageelement |
    | 5C2DE948CFCE11DBAE09CDEA25059E589E58 | midcom_core_privilege |
    | 1664DE8ACFDD11DB95E7C514E61AC918C918 | org_routamc_photostream_photo |
    | 16671B6ECFDD11DB95E7C514E61AC918C918 | org_routamc_photostream_photo_i |
    | 1AB42A2CCFDD11DB95E7C514E61AC918C918 | org_routamc_gallery_photolink |
    | 346607C059FC11DCAF7A7BB29D5E73347334 | midcom_services_at_entry_db |
    | A1BAEF6C1E2B11DEB3F4DF0A8C25EA8EEA8E | midcom_core_login_session |
    +--------------------------------------+---------------------------------+
    14 rows in set (0.25 sec)

    mysql> select guid,realm from repligard where id=153;
    +--------------------------------------+---------------------------------+
    | guid | realm |
    +--------------------------------------+---------------------------------+
    | 0f5ab82fe6c83aece4c969470b0a50b8 | element |
    | e7b5e0c2e1a395ec735a33df4f422efc | snippet |
    | 0846bed617a584ba828cf1742454b0e1 | record_extension |
    | 731ce1d8b4bde32e73e2c70406837a91 | page |
    | 5aaf11e51c5d9956011240eb4b2ad470 | eventmember |
    | ee163297585e93555b249e3a96cfd9c5 | member |
    | 0713cb44547977c27bb09c82251a6ef5 | language |
    | acaa16866ec65e86dc38c27d716585c1 | pageelement |
    | 5c2de948cfce11dbae09cdea25059e589e58 | midcom_core_privilege |
    | 1664de8acfdd11db95e7c514e61ac918c918 | org_routamc_photostream_photo |
    | 16671b6ecfdd11db95e7c514e61ac918c918 | org_routamc_photostream_photo_i |
    | 1ab42a2ccfdd11db95e7c514e61ac918c918 | org_routamc_gallery_photolink |
    | 346607c059fc11dcaf7a7bb29d5e73347334 | midcom_services_at_entry_db |
    | a1baef6c1e2b11deb3f4df0a8c25ea8eea8e | midcom_core_login_session |
    +--------------------------------------+---------------------------------+
    14 rows in set (0.26 sec)

    Here the binary version is slightly faster (don't know if the time spent
    in the HEX function is included), probably because there is only half as
    much data to read from the binary guid.

    The table is too small to affect insert performance:

    mysql> INSERT into repligard set
    guid='a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae',realm='midcom_core_login_session',id=-1;
    Query OK, 1 row affected (0.00 sec)

    mysql> INSERT into repligard_bin set
    guid=UNHEX('a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae'),realm='midcom_core_login_session',id=-1;
    Query OK, 1 row affected (0.00 sec)


    CONCLUSION:

    There is no clear winner from performance POV on this setup and dataset.

    I think it would be worth the time to generate a few million guids and
    do some more tests with larger tables (tepheikk: how many guids in Qaiku
    repligard table ?)

    Since these tables fit easily to memory and my ZFS setup in general is
    very fast (raidz2 across 8 SATA disks, plenty of cache memory) I/O is
    not an issue here, though I have a feeling that if we were to hit disks
    (on inserts for example) the binary version which is much more efficient
    space-wise would be as much faster.

    /Rambo
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  5. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Sun July 19 2009 16:05:06 UTC
    Eero af Heurlin wrote:
    >>> - Debugging on DB level becomes slightly more painful (but this matters
    >>> not to users or non-core developers...)
    >> Debugging is always difficult tusk, so I wouldn't say it's con.
    >>
    >
    > It's a trivial matter, but con anyway.
    >

    mainly it means select * is not an option (since you will get raw
    binary, possibly with control character that mess your terminal), so you
    need to specify the columns you want and if you want see guid say
    'HEX(guid)' to get hex presentation.

    But yes, trivial annoyance and doesn't matter at all to anyone but those
    messing about the core.


    /Rambo
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  6. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Sun July 19 2009 16:45:06 UTC
    Eero af Heurlin wrote:
    > Made some tests on my Nexenta (GNU/OpenSolaris) midgard zone (not really
    > a VM but close).
    >
    > dumped repligard table with --opt from my personal installation:
    >
    > 60M Jul 19 21:50 repligard.sql.orig
    >
    > Made a database 'test' and imported the .orig file as the 'repligard'
    > table, then made the table with binary guids:
    >
    > cp repligard.sql.orig repligard_bin.sql
    > sed -i.sed -r -e "s/'([0-9a-f]{32,80})'/UNHEX('\1')/g" repligard_bin.sql
    > sed -i -e 's/repligard/repligard_bin/' repligard_bin.sql
    >

    Forgot to mention: manually edited the guid from varchar(80) to
    varbinary(40).

    /Rambo

    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  7. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Sun July 19 2009 20:12:38 UTC

    Posting here via web interface as well since it seems the mailing-list lurker does not like the attachments in the original mail.


    Tested with repligard table of ~3M rows.

    1. created new test database, created the two SQL files (about 1.1G uncompressed)
    2. time mysql -p<REMOVED> test <large_repligard.sql
      
      real    18m58.036s
      user    0m19.756s
      sys     0m1.495s
      
    3. time mysql -p<REMOVED> test <large_repligard_bin.sql
      real    15m32.668s
      user    0m20.710s
      sys     0m1.533s
      
    4. ls -lah /var/lib/mysql/test/
      total 1.2G
      drwx------ 2 mysql mysql    9 Jul 20 00:01 .
      drwxr-xr-x 6 mysql mysql   11 Jul 19 23:40 ..
      -rw-rw---- 1 mysql mysql   65 Jul 19 23:40 db.opt
      -rw-rw---- 1 mysql mysql 326M Jul 19 23:44 repligard.MYD
      -rw-rw---- 1 mysql mysql 371M Jul 19 23:59 repligard.MYI
      -rw-rw---- 1 mysql mysql 9.7K Jul 19 23:40 repligard.frm
      -rw-rw---- 1 mysql mysql 269M Jul 20 00:04 repligard_bin.MYD
      -rw-rw---- 1 mysql mysql 243M Jul 20 00:17 repligard_bin.MYI
      -rw-rw---- 1 mysql mysql 9.7K Jul 20 00:01 repligard_bin.frm
      
      Significantly smaller index file here.
    5. Test script
      #!/bin/bash
      MYSQL_PASS="my_pass"
      ROWS=`mysql -B -p$MYSQL_PASS test -e 'select count(*) from repligard' |
      grep -v [a-z]`
      AWKSCRIPT="BEGIN{ srand() }
      END{ for(i=1;i<=100;i++) print randint($ROWS) }
      function randint(n) {return int(n * rand()) + 1}"
      for i in `echo | awk "$AWKSCRIPT"`;
      do
      #    mysql -p$MYSQL_PASS test -e "SELECT guid,id,realm FROM repligard
      LIMIT $i,1;"
      #    mysql -p$MYSQL_PASS test -e "SELECT HEX(guid) as guid,id,realm FROM
      repligard_bin LIMIT $i,1;"
          echo "SELECT guid,id,realm FROM repligard LIMIT $i,1;"
          echo "SELECT HEX(guid) as guid,id,realm FROM repligard_bin LIMIT $i,1;"
      done
      
      This gives out a bunch (200) of SQL statements for getting (same)random rows from the tables which we then execute in the mysql shell to get the timing info (see attachment random_seeks). Now these statements are basically stupid seeks, but they show that when we get to millions of rows varchar vs varbinary starts to matter from performance POV as well.
    6. Other test script
      #!/bin/bash
      MYSQL_PASS="my_pass"
      ROWS=`mysql -B -p$MYSQL_PASS test -e 'select count(*) from repligard' |
      grep -v [a-z]`
      AWKSCRIPT="BEGIN{ srand() }
      END{ for(i=1;i<=100;i++) print randint($ROWS) }
      function randint(n) {return int(n * rand()) + 1}"
      for i in `echo | awk "$AWKSCRIPT"`;
      do
          GUID=`mysql -B -p$MYSQL_PASS test -e "SELECT guid FROM repligard
      LIMIT $i,1;" | grep -v 'guid'`
          echo "SELECT guid,id,realm FROM repligard WHERE guid='$GUID';"
          echo "SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE
      guid=UNHEX('$GUID');"
      done
      
      Another output of statements we try in mysql shell (after restarting the server to clear caches), attachment random_guids. Here we see basically no difference between the two tables.
    7. Inserts
      mysql> INSERT into repligard set guid='a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae',realm='midcom_core_login_session',id=-1;
      Query OK, 1 row affected (0.00 sec)
      
      mysql> INSERT into repligard_bin set guid=UNHEX('a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae'),realm='midcom_core_login_session',id=-1;
      Query OK, 1 row affected (0.00 sec)
      
      Inserting single rows shows no difference here (probably goes to cache/queue)
    8. Updates (server restarted between to clear caches)
      mysql> UPDATE repligard SET realm='midcom_core_login_session_upd' WHERE guid='a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae';
      Query OK, 1 row affected (0.01 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      
      mysql> UPDATE repligard_bin SET realm='midcom_core_login_session_upd' WHERE guid=UNHEX('a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae');
      Query OK, 1 row affected (0.00 sec)
      Rows matched: 1  Changed: 1  Warnings: 0
      
      Second try (again restart in between) showed no time difference here, however I enabled profiling:
      mysql> show profiles;
      +----------+------------+-----------------------------------------------------------------------------------------------------------------------------+
      | Query_ID | Duration   | Query      |
      +----------+------------+-----------------------------------------------------------------------------------------------------------------------------+
      |        1 | 0.00083000 | UPDATE repligard SET realm='midcom_core_login_session_upd' WHERE guid='a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae'            |
      |        2 | 0.00026675 | UPDATE repligard_bin SET realm='midcom_core_login_session_upd' WHERE guid=UNHEX('a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae') |
      |        3 | 0.00024950 | UPDATE repligard SET realm='midcom_core_login_session_upd' WHERE guid='a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae'            |
      |        4 | 0.00018250 | UPDATE repligard_bin SET realm='midcom_core_login_session_upd' WHERE guid=UNHEX('a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae') |
      |        5 | 0.00020675 | UPDATE repligard SET realm='midcom_core_login_session_upd' WHERE guid='a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae'            |
      |        6 | 0.00016825 | UPDATE repligard_bin SET realm='midcom_core_login_session_upd' WHERE guid=UNHEX('a1baef6c1e2b11deb3f4df0a8c25ea8eea8eaeae') |
      +----------+------------+-----------------------------------------------------------------------------------------------------------------------------+
      6 rows in set (0.00 sec)
      
      Here the binary version is consistently significantly faster, while at it I did some GUID seeks with profiling enabled:
      +----------+------------+---------------------------------------------------------------------------------------------------------------+
      | Query_ID | Duration   | Query                                                                |
      +----------+------------+---------------------------------------------------------------------------------------------------------------+
      |        1 | 0.00004125 | set profiling_history_size=100            |
      |        2 | 0.00211800 | SELECT guid,id,realm FROM repligard WHERE guid='67d07320a39e11dcb09c5d955494086f086f'                         |
      |        3 | 0.00074200 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('67d07320a39e11dcb09c5d955494086f086f') |
      |        4 | 0.00023200 | SELECT guid,id,realm FROM repligard WHERE guid='110e571c9d5111dcb7b5affb63d5b221b221'                         |
      |        5 | 0.00017700 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('110e571c9d5111dcb7b5affb63d5b221b221') |
      |        6 | 0.00077450 | SELECT guid,id,realm FROM repligard WHERE guid='817cd724bc8e11dca45e0b66222a75d875d8'                         |
      |        7 | 0.00057650 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('817cd724bc8e11dca45e0b66222a75d875d8') |
      |        8 | 0.00053175 | SELECT guid,id,realm FROM repligard WHERE guid='2df131ba895011dd89d383f0b77c3e2c3e2c'                         |
      |        9 | 0.00029725 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('2df131ba895011dd89d383f0b77c3e2c3e2c') |
      |       10 | 0.00023000 | SELECT guid,id,realm FROM repligard WHERE guid='b2175400712711de92bc3350a43365cb65cb'                         |
      |       11 | 0.00086900 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('b2175400712711de92bc3350a43365cb65cb') |
      |       12 | 0.00106700 | SELECT guid,id,realm FROM repligard WHERE guid='b91d206030d411deb29f897c8053de46de46'                         |
      |       13 | 0.00046800 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('b91d206030d411deb29f897c8053de46de46') |
      |       14 | 0.00004500 | SELECT guid,id,realm FROM repligard WHERE guid='67d07320a39e11dcb09c5d955494086f086f'                         |
      |       15 | 0.00002525 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('67d07320a39e11dcb09c5d955494086f086f') |
      |       16 | 0.00002850 | SELECT guid,id,realm FROM repligard WHEREguid='110e571c9d5111dcb7b5affb63d5b221b221'                         |
      |       17 | 0.00003000 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('110e571c9d5111dcb7b5affb63d5b221b221') |
      |       18 | 0.00002800 | SELECT guid,id,realm FROM repligard WHEREguid='817cd724bc8e11dca45e0b66222a75d875d8'                         |
      |       19 | 0.00002875 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('817cd724bc8e11dca45e0b66222a75d875d8') |
      |       20 | 0.00002725 | SELECT guid,id,realm FROM repligard WHERE guid='2df131ba895011dd89d383f0b77c3e2c3e2c'                         |
      |       21 | 0.00002825 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('2df131ba895011dd89d383f0b77c3e2c3e2c') |
      |       22 | 0.00002725 | SELECT guid,id,realm FROM repligard WHERE guid='b2175400712711de92bc3350a43365cb65cb'                         |
      |       23 | 0.00002775 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('b2175400712711de92bc3350a43365cb65cb') |
      |       24 | 0.00002700 | SELECT guid,id,realm FROM repligard WHERE guid='b91d206030d411deb29f897c8053de46de46'                         |
      |       25 | 0.00002625 | SELECT HEX(guid) as guid,id,realm FROMrepligard_bin WHERE guid=UNHEX('b91d206030d411deb29f897c8053de46de46') |
      |       26 | 0.00004475 | SELECT guid,id,realm FROM repligard WHERE guid='67d07320a39e11dcb09c5d955494086f086f'                         |
      |       27 | 0.00002875 | SELECT HEX(guid) as guid,id,realm FROMrepligard_bin WHERE guid=UNHEX('67d07320a39e11dcb09c5d955494086f086f') |
      |       28 | 0.00003350 | SELECT guid,id,realm FROM repligard WHERE guid='110e571c9d5111dcb7b5affb63d5b221b221'                         |
      |       29 | 0.00002775 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('110e571c9d5111dcb7b5affb63d5b221b221') |
      |       30 | 0.00002800 | SELECT guid,id,realm FROM repligard WHERE guid='817cd724bc8e11dca45e0b66222a75d875d8'                         |
      |       31 | 0.00002800 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('817cd724bc8e11dca45e0b66222a75d875d8') |
      |       32 | 0.00002700 | SELECT guid,id,realm FROM repligard WHERE guid='2df131ba895011dd89d383f0b77c3e2c3e2c'                         |
      |       33 | 0.00002725 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('2df131ba895011dd89d383f0b77c3e2c3e2c') |
      |       34 | 0.00002750 | SELECT guid,id,realm FROM repligard WHERE guid='b2175400712711de92bc3350a43365cb65cb'                         |
      |       35 | 0.00002675 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('b2175400712711de92bc3350a43365cb65cb') |
      |       36 | 0.00002800 | SELECT guid,id,realm FROM repligard WHERE guid='b91d206030d411deb29f897c8053de46de46'                         |
      |       37 | 0.00002550 | SELECT HEX(guid) as guid,id,realm FROM repligard_bin WHERE guid=UNHEX('b91d206030d411deb29f897c8053de46de46') |
      +----------+------------+---------------------------------------------------------------------------------------------------------------+
      
      Again the binary version is faster even when we get result from cache.

    /Rambo

    •  Reply
  8. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Wed July 22 2009 16:40:07 UTC
    Hi!

    I do not see any performance boost.
    Here are my tests:

    1481970 rows

    mysql> create table guids (guid varchar(80));
    Query OK, 0 rows affected (0.00 sec)
    mysql> insert into guids (guid) SELECT guid from repligard;
    Query OK, 1481970 rows affected (56.39 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> create table guids_bin (guid BLOB);
    Query OK, 0 rows affected (0.01 sec)
    mysql> insert into guids_bin (guid) SELECT hex(guid) from repligard;
    Query OK, 1481970 rows affected (2.04 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> select count(guid) from guids;
    +-------------+
    | count(guid) |
    +-------------+
    | 1481970 |
    +-------------+
    1 row in set (0.28 sec)

    mysql> select count(guid) from guids_bin;
    +-------------+
    | count(guid) |
    +-------------+
    | 1481970 |
    +-------------+
    1 row in set (0.31 sec)

    Size:

    -rw-rw---- 1 mysql mysql 8560 07-22 17:00 guids_bin.frm
    -rw-rw---- 1 mysql mysql 118471204 07-22 17:01 guids_bin.MYD
    -rw-rw---- 1 mysql mysql 1024 07-22 17:01 guids_bin.MYI
    -rw-rw---- 1 mysql mysql 8560 07-22 16:57 guids.frm
    -rw-rw---- 1 mysql mysql 65163488 07-22 17:00 guids.MYD
    -rw-rw---- 1 mysql mysql 1024 07-22 17:00 guids.MYI

    -rw-rw---- 1 mysql mysql 8,4K 07-22 17:00 guids_bin.frm
    -rw-rw---- 1 mysql mysql 113M 07-22 17:01 guids_bin.MYD
    -rw-rw---- 1 mysql mysql 1,0K 07-22 17:01 guids_bin.MYI
    -rw-rw---- 1 mysql mysql 8,4K 07-22 16:57 guids.frm
    -rw-rw---- 1 mysql mysql 63M 07-22 17:00 guids.MYD
    -rw-rw---- 1 mysql mysql 1,0K 07-22 17:00 guids.MYI

    mysql> CREATE INDEX guid_idx ON guids(guid);
    Query OK, 1481970 rows affected (3.82 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> CREATE INDEX guid_idx ON guids_bin(guid (80));
    Query OK, 1481970 rows affected (9.60 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    -rw-rw---- 1 mysql mysql 8,4K 07-22 17:17 guids_bin.frm
    -rw-rw---- 1 mysql mysql 113M 07-22 17:17 guids_bin.MYD
    -rw-rw---- 1 mysql mysql 113M 07-22 17:17 guids_bin.MYI
    -rw-rw---- 1 mysql mysql 8,4K 07-22 17:14 guids.frm
    -rw-rw---- 1 mysql mysql 63M 07-22 17:14 guids.MYD
    -rw-rw---- 1 mysql mysql 59M 07-22 17:14 guids.MYI

    mysql> select guid from guids limit 1 offset 765438;
    +--------------------------------------+
    | guid |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.46 sec)

    mysql> select guid from guids where
    guid='8186ea0039c711dd98e0bd1e904e6cc66cc6';
    +--------------------------------------+
    | guid |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.00 sec)

    mysql> select unhex(guid) from guids_bin limit 1 offset 765438;
    +--------------------------------------+
    | unhex(guid) |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.13 sec)

    mysql> select guid from guids limit 1 offset 765438;
    +--------------------------------------+
    | guid |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.46 sec)

    mysql> select guid from guids where
    guid='8186ea0039c711dd98e0bd1e904e6cc66cc6';
    +--------------------------------------+
    | guid |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.00 sec)

    mysql> show profiles;
    +----------+------------+------------------------------------------------------------------------------------------+
    | Query_ID | Duration | Query
    |
    +----------+------------+------------------------------------------------------------------------------------------+
    | 1 | 0.00035800 | select unhex(guid) from guids_bin where
    guid=hex('8186ea0039c711dd98e0bd1e904e6cc66cc6') |
    | 2 | 0.00027800 | select guid from guids where
    guid='8186ea0039c711dd98e0bd1e904e6cc66cc6' |
    +----------+------------+------------------------------------------------------------------------------------------+

    Triple records:

    mysql> insert into guids_bin (guid) SELECT hex(concat(guid, 'bintest'))
    from repligard;
    Query OK, 1481970 rows affected (1 min 39.57 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> insert into guids_bin (guid) SELECT hex(concat(guid, 'bintest'))
    from repligard;
    Query OK, 1481970 rows affected (4 min 14.41 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> insert into guids (guid) SELECT concat(guid, 'bintest') from
    repligard;
    Query OK, 1481970 rows affected (1 min 18.53 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> insert into guids (guid) SELECT concat(guid, 'bintest') from
    repligard;
    Query OK, 1481970 rows affected (2 min 27.13 sec)
    Records: 1481970 Duplicates: 0 Warnings: 0

    mysql> select guid from guids where
    guid='8186ea0039c711dd98e0bd1e904e6cc66cc6';
    +--------------------------------------+
    | guid |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.00 sec)

    mysql> select unhex(guid) from guids_bin where
    guid=hex('8186ea0039c711dd98e0bd1e904e6cc66cc6');
    +--------------------------------------+
    | unhex(guid) |
    +--------------------------------------+
    | 8186ea0039c711dd98e0bd1e904e6cc66cc6 |
    +--------------------------------------+
    1 row in set (0.00 sec)

    Profiles;

    0.00029300 | select guid from guids where
    guid='8186ea0039c711dd98e0bd1e904e6cc66cc6' |
    0.00031500 | select unhex(guid) from guids_bin where
    guid=hex('8186ea0039c711dd98e0bd1e904e6cc66cc6') |

    Piotras
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  9. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Wed July 22 2009 17:00:07 UTC
    Piotr Pokora writes:
    > Hi!
    >
    > I do not see any performance boost.
    > Here are my tests:

    15 guid constraints used with IN operator:

    0.00070600 | select guid from guids where guid IN
    ('8186ea0039c711dd98e0bd1e904e6cc66cc6',
    '8186eaae313611dc8d0191189c9ae004e004',
    '8186fb3e975711dc9a983face84444684468',
    '8186fee8bb3011dc902a4d6860a894179417',
    '8187021e8b4511dcb625714486f4990d990d',
    '8187142c8a9a11db8b33c5f15e14e7a2e7a2', '818718760e7711dc9b225 |

    0.00077800 | select unhex(guid) from guids_bin where guid IN
    (hex('8186ea0039c711dd98e0bd1e904e6cc66cc6'),
    hex('8186eaae313611dc8d0191189c9ae004e004'),
    hex('8186fb3e975711dc9a983face84444684468'),
    hex('8186fee8bb3011dc902a4d6860a894179417'),
    hex('8187021e8b4511dcb625714486f4990d990d'), hex('8187142c8a9a11db8b33c |


    Every table has 4,5M records.
    Should I make another tests? I am bit confused cause I do not see and
    performance difference.

    Piotras
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
  10. Re: [midgard-dev] Mjönir: storing GUIDs/UUIDs in binary

    Wed July 22 2009 17:15:06 UTC
    Piotr Pokora writes:
    > Piotr Pokora writes:
    >> Hi!
    >>
    >> I do not see any performance boost.
    >> Here are my tests:
    >

    One more interesting thing:

    -rw-rw---- 1 mysql mysql 8,4K 07-22 19:05 guids_bin.frm
    -rw-rw---- 1 mysql mysql 385M 07-22 19:05 guids_bin.MYD
    -rw-rw---- 1 mysql mysql 292M 07-22 19:06 guids_bin.MYI
    -rw-rw---- 1 mysql mysql 8,4K 07-22 17:14 guids.frm
    -rw-rw---- 1 mysql mysql 198M 07-22 18:06 guids.MYD
    -rw-rw---- 1 mysql mysql 243M 07-22 18:12 guids.MYI

    The more guids stored as binary the larger table is.

    Piotras
    _______________________________________________
    dev mailing list
    dev@lists.midgard-project.org
    http://lists.midgard-project.org/mailman/listinfo/dev
    •  Reply
1 2 3 4 next »
Designed by Nemein, hosted by Kafit