Sunday, March 13, 2011

What is my index creation doing?


As part of our new application release based on the row selectivity estimates and for performance reasons we decided to create an index on table named test. The table in discussion had 36 million rows and the size was around 7GB. One of my colleague started the index creation process with the following command.
"create index indx_test on test(advertiser_id);"


On firing the command the screen looked like as though the command was running. It was running for more than 20 minutes and my colleague had no idea when the index creation is going to finish because PostgreSQL doesn't print any progress messages. Now my colleague's manager posted a question to him. Is the index creation runnning and what exactly it's doing? For the first part of the question my colleague figured an answer by running
 "select * from pg_stat_activity where current_query ilike 'create index %'"
and the command was indeed running because the column waiting was equal to false.

Puzzled about the second part of the question my colleague called me for help. Since PostgreSQL doesn't print progress messages we decided to use strace for our rescue. On stracing(strace -p 21060) the process we found the process was doing a lot of read and write calls. On closer look at the arguments for read and write system calls we found the File Descriptors the process was using. To be more precise
the numbers were 17(read call) and 21(write call).


The backend pid of the job was 21060 and we changed our directory to /proc/21060/fd. On executing ls -l on 17 and 21 the output pointed us to relfilenode(498021.4) and a temporary file it was creating for index. On looking at the entry in pg_class for the same refilenode we ended with table named test. Actually the command was reading the 5th GB of the 7GB files since the table was around 7GB in size. In postgresql based on the way you compile the server a new file is created for each additinal GB for the
same table/index with same relfilenode with .1,.2 as extensions. For more details refer to http://www.postgresql.org/docs/8.3/static/storage-file-layout.html

Armed with the above knowledge my colleague informed his manager that index creation would finish approximately in another 20 - 25 minutes since initially it took 20 minutes to read 4GB of the file from the start(1GB = 5mins) and approximately it still needs to read another 4GB( 7GB(table size) + 1GB temporary file + creation of new index file = 45 minutes approx.) . The above timings are all rough estimates.
The system in discussion had low IO throughput because 5 minutes for reading 1GB of file is really on the higher side. I also suggested my colleague to look whats happening with the IO on the system.
   Long story short strace along with File Descriptors and relfilenode revealed what index creation process was exactly doing.


4 comments:

  1. Loved this article. I wrote a bash/psql script to automate all of this for my DBAs to use. I'll toss it up on GItHub if anyone thinks the community would want it.

    ReplyDelete
  2. https://github.com/cdhouch/pgprogress.git

    ReplyDelete
    Replies
    1. Dear Eli, I've found your bash script wonderful! Is it possible to check for an error it shows in my case?

      Delete
  3. Not Able to Write Index Query in PostgreSQL? Contact to Postgres SQL Support for Windows
    Generally new clients who are working with PostgreSQL database they faces this sort of issue. Well! Try not to freeze simply take a profound inhale, on the grounds that we let you know how you can undoubtedly compose file question in PostgreSQL database. Nonetheless, aside from this issue in the event that you locate some other multifaceted nature or basic issue then you can contact to PostgreSQL Relational Database Service or Remote PostgreSQL Performance Service. For Index Query: SELECT * from TabEmp WITH (INDEX (idx_TabEMP)). Just along these lines you can compose the record inquiry. For any issue or question contact to Postgres SQL Support for Linux.
    For More Info: https://cognegicsystems.com/
    Contact Number: 1-800-450-8670
    Email Address- info@cognegicsystems.com
    Company’s Address- 507 Copper Square Drive Bethel Connecticut (USA) 06801

    ReplyDelete