[IQUG] Slower and slower performances at IQ version 16.0 while many more concurrent jobs were run in parallel

John Anthony chirayithaj at gmail.com
Wed Sep 12 09:04:43 MST 2018


1. This depends on the vintage of your storage, its architecture, and how
it is laid out.

For contrast: in the old days there were segmented access paths to storage,
i.e. different datafiles went through - or to - different controllers /
disks et al. Thus placing your datafiles on different access paths and
striping over them was critical.

IQ by default still assumes this kind of layout, and therefore stripes
across all the data files it has access to for a given object.

These days, due to improvements in h/w capacity, striping, and the maturing
of OS implementations (e.g. bonding of devices / controllers), it is typical
for storage to be lumped together into massive containers, with no
significant benefit from where and how reads/writes happen: a large IO to a
single file gets you about the same as small I/Os spread over multiple
files. The fronting of IO by SSD-type devices in most storage appliances
makes this even more the case.

So look at your storage to see if file segmentation makes any difference,
but increasingly the difference is minimal.

For the same reason that storage segmentation is losing its appeal - the
available technology is easier to administer and provides more features -
it is becoming attractive to run IQ with few dbfiles.

2. Given the response to 1 above, it makes no sense to have separate
dbspaces for their own sake unless you are dealing with vintage hardware.
Separate dbspaces for IQ organization and administrative convenience do
make sense!
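
If you want to see the current layout before deciding, a minimal sketch
(sp_iqdbspace and sp_iqfile are the stock IQ procedures for this; the exact
output columns vary by version):

1> sp_iqdbspace
2> go
1> sp_iqfile
2> go

sp_iqdbspace summarizes usage per dbspace, and sp_iqfile lists the
individual dbfiles behind each dbspace - enough to see whether striping
across many files is actually in play.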

-JA

On Wed, Sep 12, 2018 at 10:31 AM Pankaj Pathak <pathak1611 at gmail.com> wrote:

>
> Hi All,
>
> Just a question in this mail chain regarding slowness is,
>
> 1) Does the number of datafiles in a huge environment impact the
> reads (selects) and writes (loads/inserts)?
>
> For example, suppose a 1 PB database consists of one user dbspace with,
> say, 300 datafiles. Will this count (300) impact read/write performance,
> compared with the same setup of a 1 PB database / 1 user dbspace / 100
> dbfiles?
>
> 2) If I divide this 1 user dbspace into, say, 5 user dbspaces, dividing
> the files in the same proportion, will I gain performance?
>
> Regards,
>
>
>
> On Mon, Sep 10, 2018 at 9:56 PM Steve Shen <sshen at sscinc.com> wrote:
>
>> Hi Mark,
>>
>>
>>
>> Thank you very much.
>>
>>
>>
>> Regards
>>
>> Steve Shen
>>
>> *From:* Mumy, Mark [mailto:mark.mumy at sap.com]
>> *Sent:* Monday, September 10, 2018 11:56 AM
>> *To:* Steve Shen; Baker, Chris; 'iqug at iqug.org'
>> *Cc:* 'Mark Mumy'; 'Ron Watkins'
>> *Subject:* Re: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> Max_Plans_Cached controls how many query plans a connection can keep
>> around.
>>
>>
>>
>> Using the proper manuals….
>>
>>
>>
>>
>> https://help.sap.com/viewer/a894a54d84f21015b142ffe773888f8c/16.1.3.0/en-US/3bd7ffa56c5f10149bd9de084889e1b1.html?q=max_plans_cached
>>
>>
>>
>> Over time, a given connection may acquire cached plans that can optimize
>> performance for the execution of a number of services. The number of cached
>> plans can be controlled using the max_plans_cached option.
>>
>>
>>
>>
>> https://help.sap.com/viewer/a898e08b84f21015969fa437e89860c8/16.1.3.0/en-US/3bc695906c5f10148aaabe9d8f52d0e5.html?q=max_plans_cached
>>
>> The maximum number of execution plans to be stored in a cache. *This
>> property corresponds to the max_plans_cached option*.
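>>
>> To see both levels on a live server, a minimal sketch
>> (connection_property() and the SYSOPTIONS view are the standard SQL
>> Anywhere interfaces that IQ inherits; the exact output will vary):
>>
>> 1> select connection_property('max_plans_cached')
>> 2> go
>> 1> select user_name, "option", setting from SYSOPTIONS
>> 2> where "option" = 'max_plans_cached'
>> 3> go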
>>
>>
>>
>> Mark
>>
>>
>>
>> *Mark Mumy*
>>
>> *Strategic Technology Incubation Group*
>>
>> *SAP Global Center of Excellence *|  *SAP*
>>
>> M +1 347-820-2136 | E mark.mumy at sap.com
>>
>> My Blogs: https://blogs.sap.com/author/markmumy/
>>
>>
>>
>> https://sap.na.pgiconnect.com/I825063
>>
>> Conference tel: 18663127353,,8035340905#
>>
>>
>>
>> *From: *Steve Shen <sshen at sscinc.com>
>> *Date: *Monday, September 10, 2018 at 10:46 AM
>> *To: *Mark Mumy <mark.mumy at sap.com>, Chris Baker <c.baker at sap.com>, "
>> iqug at iqug.org" <iqug at iqug.org>
>> *Cc: *"markdmumy at gmail.com" <markdmumy at gmail.com>, Ron Watkins <
>> rwatkins at dssolutions.com>
>> *Subject: *RE: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> It’s confusing that there is a PUBLIC database option called
>> max_plans_cached:
>>
>> PUBLIC          max_plans_cached          40
>>
>>
>>
>> “max_plans_cached” is also a connection property:
>>
>>
>> http://infocenter.sybase.com/help/index.jsp?topic=/com.sybase.infocenter.dc38151.1604/doc/html/saiq-connection-properties.html
>>
>>
>>
>> Regards,
>>
>> Steve Shen
>>
>> *From:* Steve Shen
>> *Sent:* Monday, September 10, 2018 11:28 AM
>> *To:* 'Mumy, Mark'; Baker, Chris; iqug at iqug.org
>> *Cc:* 'Mark Mumy'; 'Ron Watkins'
>> *Subject:* RE: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> I suspect that the performance bottleneck is caused by the following
>> database option:
>>
>>
>>
>> max_plans_cached: the current value is 20.
>>
>>
>>
>> This number seems too low.  I could successfully reset the value to 40 in
>> the following way, but it is a static change and the IQ server will need
>> to be bounced:
>>
>>
>>
>> sybase at ykt1eiquat37: /siqsoftware/sybase/release ==> isql -S EIQCGSUAT
>> -U DBA
>>
>> Password:
>>
>> 1> set option PUBLIC.max_plans_cached=40
>>
>> 2> go
>>
>> 1>
>>
>>
>>
>> Why couldn’t I reset it the following way successfully?
>>
>>
>>
>> sybase at ykt1eiquat37: /siqsoftware/sybase/release ==> isql -S EIQCGSUAT
>> -U DBA
>>
>> Password:
>>
>> 1> call sa_server_option ('max_plans_cached', '40')
>>
>> 2> go
>>
>> Msg 155, Level 15, State 0:
>>
>> SQL Anywhere Error -200: Invalid option 'max_plans_cached' -- no PUBLIC
>> setting
>>
>> exists
>>
>> 1> call sa_server_option ('PUBLIC.max_plans_cached', '40')
>>
>> 2> go
>>
>> Msg 155, Level 15, State 0:
>>
>> SQL Anywhere Error -200: Invalid option 'PUBLIC.max_plans_cached' -- no
>> PUBLIC
>>
>> setting exists
>>
>>
>>
>> Can I change it dynamically without rebooting the IQ server?
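>>
>> (One variant I can still try - a sketch only: SET TEMPORARY OPTION is the
>> standard SQL Anywhere form that takes effect immediately without a
>> restart, though only for the current connection:)
>>
>> 1> set temporary option max_plans_cached = 40
>> 2> go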
>>
>>
>>
>> Thanks and regards,
>>
>> Steve Shen
>>
>> *From:* Mumy, Mark [mailto:mark.mumy at sap.com <mark.mumy at sap.com>]
>> *Sent:* Sunday, September 09, 2018 5:36 PM
>> *To:* Steve Shen; Baker, Chris; iqug at iqug.org
>> *Cc:* 'Mark Mumy'; 'Ron Watkins'
>> *Subject:* Re: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> And ASE is not IQ.  So what works well for ASE may not work well with
>> IQ.  In fact, I am fairly certain it won’t.  ASE uses significantly smaller
>> IOs than IQ and pushes a lot less data than IQ does.
>>
>>
>>
>> I go back to your original point.  You said that this worked under IQ
>> 15.4 on raw devices.  Two big changes.  To determine what the issue is, you
>> either need to drop back to a version that worked well (and use
>> filesystems) or revert to raw devices and test the current version of IQ.
>>
>>
>>
>> Mark
>>
>>
>>
>> *Mark Mumy*
>>
>> *Strategic Technology Incubation Group*
>>
>> *SAP Global Center of Excellence *|  *SAP*
>>
>> M +1 347-820-2136 | E mark.mumy at sap.com
>>
>> My Blogs: https://blogs.sap.com/author/markmumy/
>>
>>
>>
>> https://sap.na.pgiconnect.com/I825063
>>
>> Conference tel: 18663127353,,8035340905#
>>
>>
>>
>> *From: *Steve Shen <sshen at sscinc.com>
>> *Date: *Saturday, September 8, 2018 at 4:22 PM
>> *To: *Mark Mumy <mark.mumy at sap.com>, Chris Baker <c.baker at sap.com>, "
>> iqug at iqug.org" <iqug at iqug.org>
>> *Cc: *"markdmumy at gmail.com" <markdmumy at gmail.com>, Ron Watkins <
>> rwatkins at dssolutions.com>
>> *Subject: *RE: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> Hi Mark,
>>
>>
>>
>> I missed reading your reply email until today.  Thank you very much.
>>
>>
>>
>> It’s very difficult to convince senior management not to use file
>> systems, because the SSD-based file systems hosting our ASE production
>> servers on Linux seemed to work very well.  We used to have poor
>> performance while using SAN-based file systems on Solaris 10, but that is
>> no longer the case using SSD-based file systems on Linux, Kernel 2.6 or
>> Kernel 2.7.
>>
>>
>>
>> I am very sure that Red Hat 7.4 with xfs file systems was using Direct
>> I/O, because we specified the ASE devices with “directio=true” and the
>> ASE log file did indicate that DIRECTIO was in effect.
>>
>> The following is just one example from a given ASE server:
>>
>>
>>
>> 00:0000:00000:00002:2018/09/07 23:08:07.90 kernel  Virtual device 86
>> started using asynchronous (with DIRECTIO) i/o.
>>
>> 00:0000:00000:00002:2018/09/07 23:08:07.90 kernel  Virtual device 87
>> started using asynchronous (with DIRECTIO) i/o.
>>
>>
>>
>> Besides, DIRECTIO is automatically taken care of by the latest system
>> patch for IQ version 16.0 on Red Hat 7.4.
>>
>>
>>
>> I will speak to UNIX SAs next week on your recommendations below:
>>
>>>>
>> You could also try rebuilding the filesystem with larger block sizes.
>> Use a block size that matches the IQ block size (not page size).
>>
>> mkfs.xfs -b size=8192 /dev/sdXXXXXX
>>
>> - I used 8192, assuming you have the default IQ layout of a 128k
>> page, which forces an 8k block
>>
>> - Adjust if your page size is higher
>>
>>
>>
>> Measure filesystem fragmentation to see if this is the cause:
>>
>> xfs_db -c frag -r /dev/sdXXXXXX
>>
>>
>>
>> Expanding Chris’ mount options:
>>
>> - nodiratime,relatime,noatime
>>
>>>>
>>
>>
>> What frustrated me was that the UNIX SA managers sometimes could not tell
>> me the block size on the SAN or SSD.  And they told me that the SSD disks
>> looked like one big disk to them: they had no way to dedicate some SSD
>> disks to the data store and other SSD disks to the temp store.  They
>> claimed that having one file system vs. multiple file systems made no
>> difference in terms of performance.  We used to segregate the ASE data
>> file system from the log file system when using SAN disks, but nowadays
>> it’s just one file system for data and temporary database devices on SSD!
>> Were these statements really true?
>>
>>
>>
>> Most of the file systems discussed in Linux or Solaris technical
>> documents are based on traditional local disks or SAN disks.  They might
>> not be applicable to SSDs, but I am not an expert in SSDs.  I am waiting
>> for your feedback: does it really matter whether you have one big file
>> system holding multiple data and temp store files vs. multiple file
>> systems for the data and temp store files, in an SSD storage environment?
>> For example, I can have one big file system holding 1 SYSTEM_MAIN, 3
>> SYSTEM_TEMP and 10 USER_MAIN data store files, or I can create 14 file
>> systems.  Will having 14 file systems definitely perform better than one
>> file system on SSD storage devices?  BTW, the SSDs are shared between the
>> ASE servers and the IQ servers.
>>
>>
>>
>> Regards,
>>
>> Steve Shen
>>
>> *From:* Mumy, Mark [mailto:mark.mumy at sap.com <mark.mumy at sap.com>]
>> *Sent:* Friday, September 07, 2018 12:54 PM
>> *To:* Baker, Chris; Steve Shen; iqug at iqug.org
>> *Cc:* 'Mark Mumy'; 'Ron Watkins'
>> *Subject:* Re: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> Just to add to Chris’ comments….
>>
>>
>>
>> You have to make sure that the filesystem supports direct IO and that IQ
>> is able to open the devices with direct IO.  This “should” bypass the
>> cache.  It sounds like either the filesystem block size is breaking
>> things up, or the filesystem cache is being used and is getting in the
>> way of things.
>>
>>
>>
>> As a sanity check, can you run this same test on IQ with raw devices?  If
>> you can do that, it will tell us where the issue is.  We think it is the
>> filesystem, and that will certainly show if it is.  Without a test on raw
>> devices it is total speculation.
>>
>>
>>
>> You could also try rebuilding the filesystem with larger block sizes.
>> Use a block size that matches the IQ block size (not page size).
>>
>> mkfs.xfs -b size=8192 /dev/sdXXXXXX
>>
>> - I used 8192, assuming you have the default IQ layout of a 128k
>> page, which forces an 8k block
>>
>> - Adjust if your page size is higher
>>
>>
>>
>> Measure filesystem fragmentation to see if this is the cause:
>>
>> xfs_db -c frag -r /dev/sdXXXXXX
>>
>>
>>
>> Expanding Chris’ mount options:
>>
>> - nodiratime,relatime,noatime
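>>
>> As an illustration only (the device name, mount point, and filesystem
>> type here are placeholders; verify the options against your own stack),
>> those options would land in /etc/fstab roughly as:
>>
>> /dev/sdXXXXXX  /iqdata  xfs  defaults,noatime,nodiratime  0 0
>>
>> Note that on current Linux kernels noatime already implies nodiratime and
>> supersedes relatime, so noatime alone is usually sufficient.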
>>
>>
>>
>> Also, tell the senior manager that “supported” is not the same as
>> performant.  We support single row inserts, too.  Would they want to change
>> all your bulk loading to single INSERT statements?  No!  While supported,
>> it will not perform at the same level.
>>
>>
>>
>> Lastly…. How is the filesystem made up?  Does it point to just 1 LUN?
>> Multiple LUNs on the SAN?  Was it created as concatenated or striped?  All
>> of this can severely alter the performance of the filesystem.
>>
>>
>>
>> Mark
>>
>>
>>
>> *Mark Mumy*
>>
>> *Strategic Technology Incubation Group*
>>
>> *SAP Global Center of Excellence *|  *SAP*
>>
>> M +1 347-820-2136 | E mark.mumy at sap.com
>>
>> My Blogs: https://blogs.sap.com/author/markmumy/
>>
>>
>>
>> https://sap.na.pgiconnect.com/I825063
>>
>> Conference tel: 18663127353,,8035340905#
>>
>>
>>
>> *From: *Chris Baker <c.baker at sap.com>
>> *Date: *Friday, September 7, 2018 at 11:35 AM
>> *To: *Steve Shen <sshen at sscinc.com>, "iqug at iqug.org" <iqug at iqug.org>
>> *Cc: *"markdmumy at gmail.com" <markdmumy at gmail.com>, Mark Mumy <
>> mark.mumy at sap.com>, Ron Watkins <rwatkins at dssolutions.com>
>> *Subject: *RE: Slower and slower performances at IQ version 16.0 while
>> many more concurrent jobs were run in parallel
>>
>>
>>
>> IMHO and AFAIK this is one of the performance penalties of using a file
>> system.  The filesystem often has a smaller ‘segment size’ than the IQ
>> block/page size.  The net effect is to break up a single IQ I/O into
>> multiple filesystem I/Os.
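>>
>> A quick way to compare the two sides (a sketch - xfs_info is the standard
>> XFS reporting tool, sp_iqstatus reports the IQ page size, and the mount
>> point is a placeholder):
>>
>> xfs_info /iqdata
>>
>> The bsize= value is the filesystem block size.  Then:
>>
>> 1> sp_iqstatus
>> 2> go
>>
>> sp_iqstatus shows the IQ page size; the IQ block size defaults to 1/16th
>> of the page size (e.g. a 128k page gives an 8k block).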
>>
>>
>>
>> You are not stating how the filesystems are mounted or what type they
>> are, or whether you have large enough IQ main and temp caches as well –
>> which can help.
>>
>>
>>
>> There is not much you can do about fs performance as such, but you can
>> look at the Linux elevator scheduler for the fs block device (if it is
>> not on a SAN or NFS, etc. – sometimes the scheduler is not available for
>> some types of block devices) and set it to ‘noop’ or ‘deadline’ to see if
>> that helps.
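>>
>> A sketch of checking and switching it on RHEL 7 (sdX is a placeholder;
>> the scheduler shown in brackets is the active one, and the echo takes
>> effect immediately but does not survive a reboot):
>>
>> cat /sys/block/sdX/queue/scheduler
>> echo deadline > /sys/block/sdX/queue/scheduler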
>>
>>
>>
>> For EXT3/4 filesystems, you can also look at turning off journaling,
>> which can slow filesystem I/O down considerably.  E.g. for EXT4, consider
>> mounting the filesystem with the additional ‘noatime,nodiratime’ options.
>> I think there may also be other options to turn off journaling, which
>> some of the other respondents can provide.
>>
>>
>>
>> Chris
>>
>>
>>
>> *Chris Baker* | Database Engineering Evangelist | CERT | PI HANA
>> Platform Data Management | SAP
>>
>> T +1 416-226-7033 | M +1 647-224-2033 | TF +1 866-716-8860
>>
>> *SAP Canada Inc. *445 Wes Graham Way, Waterloo, N2L 6R2
>>
>> c.baker at sap.com | www.sap.com
>>
>>
>>
>> https://sap.na.pgiconnect.com/I826572
>>
>> Conference tel: 1-866-312-7353,,9648565377#
>>
>>
>>
>> *From:* Steve Shen <sshen at sscinc.com>
>> *Sent:* Friday, September 7, 2018 12:23 PM
>> *To:* iqug at iqug.org
>> *Cc:* 'Mark Mumy' <markdmumy at gmail.com>; Mumy, Mark <mark.mumy at sap.com>;
>> Baker, Chris <c.baker at sap.com>; 'Ron Watkins' <rwatkins at dssolutions.com>;
>> Steve Shen <sshen at sscinc.com>
>> *Subject:* Slower and slower performances at IQ version 16.0 while many
>> more concurrent jobs were run in parallel
>>
>>
>>
>> Today’s topic:
>>
>>                 Changing from fast to slower and slower IQ 16.0
>> performance on Linux Red Hat 7.4 using file systems
>>
>>
>>
>> Let me share my IQ environments with you:
>>
>>
>>
>>    1. The O/S is Red Hat version 7.4.
>>    2. The IQ version is “SAP IQ/16.0.110.2805/11355/P/sp11.20/Enterprise
>>    Linux64 - x86_64 - 2.6.18-194.el5/64bit/2018-06-15 02:57:16”.
>>    3. The senior manager overrode my concerns about the potential
>>    performance penalty and decided to use file systems, because the SAP
>>    manuals document that file systems are supported with IQ.
>>    4. The file systems use SSDs.
>>
>>
>>
>> My clients were complaining about the following:
>>
>> 1. The average reporting job duration for similar SQL statements was
>> 395 seconds while 5 concurrent jobs were run in parallel.
>>
>> 2. It became 475 seconds while 11 concurrent jobs were run in parallel.
>>
>> 3. It became 666 seconds while 20 concurrent jobs were run in parallel.
>>
>> 4. Many more concurrent parallel runs will take place in the next two
>> weeks!
>>
>>
>>
>> In other words, IQ ran very fast when there were few jobs running
>> concurrently, but performance declined heavily as many more concurrent
>> jobs were run in parallel!
>>
>>
>>
>> My key clients could not accept this kind of declining performance at IQ
>> version 16.0.
>>
>>
>>
>> I have turned off both OS_File_Cache_Buffering and
>> OS_File_Cache_Buffering_Tempdb.  I also set IQ_USE_DIRECTIO=1 even though
>> it’s not necessary at Red Hat 7.4.
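>>
>> For reference, a sketch of the forms I used (assuming the documented SET
>> OPTION syntax for these PUBLIC options - a server restart may still be
>> needed for them to take effect - and with IQ_USE_DIRECTIO exported in the
>> environment that starts the server):
>>
>> 1> set option PUBLIC.OS_File_Cache_Buffering = 'OFF'
>> 2> go
>> 1> set option PUBLIC.OS_File_Cache_Buffering_Tempdb = 'OFF'
>> 2> go
>>
>> export IQ_USE_DIRECTIO=1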
>>
>>
>>
>> What else can I do to make this first IQ instance on Linux using file
>> systems perform much better under many concurrent parallel runs?
>>
>>
>>
>> Thank you.
>>
>>
>>
>> Regards,
>>
>>
>>
>> Steve Shen
>>
>>
>>
>> This email with all information contained herein or attached hereto may
>> contain confidential and/or privileged information intended for the
>> addressee(s) only. If you have received this email in error, please contact
>> the sender and immediately delete this email in its entirety and any
>> attachments thereto.
>>
>> _______________________________________________
>> IQUG mailing list
>> IQUG at iqug.org
>> http://iqug.org/mailman/listinfo/iqug
>
> _______________________________________________
> IQUG mailing list
> IQUG at iqug.org
> http://iqug.org/mailman/listinfo/iqug

