[IQUG] [iqug] Sharding my largest IQ table approach

Louie, David David.Louie at blackrock.com
Mon Oct 14 09:04:22 MST 2019


Yes thanks Mark, Ron.

We are identifying the cases where were we would need to join the view to tables and will rewrite the queries.

David


From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Thursday, October 10, 2019 10:42 AM
To: Ron Watkins <rwatkins at dssolutions.com>; Louie, David <David.Louie at blackrock.com>; 'Mark Mumy' <markdmumy at gmail.com>
Cc: iqug at dssolutions.com
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach


External Email: Use caution with links and attachments
This gets back to the basics of how we develop queries in the HANA world with very large data sets.  Not something we are used to with IQ, but it works quite well.

Rather than having the users call a query, have them call a procedure.  In the proc, you can then wrap all this logic up nice and neat to do what Ron describes.

Mark

Mark Mumy
SAP Platform and Technologies Global Center of Excellence
M +1 347-820-2136 | E mark.mumy at sap.com<mailto:mark.mumy at sap.com>
My Blogs: https://blogs.sap.com/author/markmumy/<https://urldefense.proofpoint.com/v2/url?u=https-3A__blogs.sap.com_author_markmumy_&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=LPuRpJI_sAJfyNpn-BlRThhsZZKv8LdK9JO9gBMIOME&s=GqAdlehpXSLXeepKIVE8KLmwaVQ57hNfiHDsoiicCjA&e=>

https://sap.na.pgiconnect.com/I825063<https://urldefense.proofpoint.com/v2/url?u=https-3A__sap.na.pgiconnect.com_I825063&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=LPuRpJI_sAJfyNpn-BlRThhsZZKv8LdK9JO9gBMIOME&s=HBz2ttz6fa4A2LnYX8Hz9hHMbCjfXV5rU0TcbpgpaYo&e=>
Conference tel: 18663127353,,8035340905#

From: Ron Watkins <rwatkins at dssolutions.com<mailto:rwatkins at dssolutions.com>>
Date: Wednesday, October 9, 2019 at 3:08 PM
To: David Louie <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>, Mark Mumy <mark.mumy at sap.com<mailto:mark.mumy at sap.com>>, "markdmumy at gmail.com<mailto:markdmumy at gmail.com>" <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: IQ Group <iqug at dssolutions.com<mailto:iqug at dssolutions.com>>
Subject: RE: [IQUG] [iqug] Sharding my largest IQ table approach

Making a multiple step query where you query individual table segments with the results being added to a temp table, then you can do basic selects, groups or distincts from the temp table.
Ron

From: iqug-bounces at iqug.org<mailto:iqug-bounces at iqug.org> [mailto:iqug-bounces at iqug.org] On Behalf Of Louie, David
Sent: Wednesday, October 9, 2019 9:52 AM
To: Mumy, Mark; Mark Mumy
Cc: iqug at dssolutions.com<mailto:iqug at dssolutions.com>
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach

Hi Mark,

I just tested selects on the union all view and seeing for the most part all select (singleton) are running around the same time as the table selects.

As you predicted running joins which involve the view and other tables are not running at all.   The join version to the actual table took secs and the joins to the view hangs for 1 hour and I killed it as it’s apparent it’s not going to work.

Is there any recourse to this other than to not join to the view?

Thanks
David


From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Thursday, August 15, 2019 4:56 PM
To: Louie, David <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>; Mark Mumy <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: iqug at dssolutions.com<mailto:iqug at dssolutions.com>
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach


External Email: Use caution with links and attachments
OK, so that’s good then.  You eliminate most of the complexities with sharding.  You could still get into some performance trouble, but it is much rarer when you don’t have joins.

Mark

Mark Mumy
SAP Platform and Technologies Global Center of Excellence
M +1 347-820-2136 | E mark.mumy at sap.com<mailto:mark.mumy at sap.com>
My Blogs: https://blogs.sap.com/author/markmumy/<https://urldefense.proofpoint.com/v2/url?u=https-3A__blogs.sap.com_author_markmumy_&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=SmTPBmBfWc2phwaZexVRx5mjJlXZUGwIUQse6bC5Fk0&s=IqPYxfejrKwu-6zNNBjCBuaarox0HU11FK1_PKP9S_M&e=>

https://sap.na.pgiconnect.com/I825063<https://urldefense.proofpoint.com/v2/url?u=https-3A__sap.na.pgiconnect.com_I825063&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=SmTPBmBfWc2phwaZexVRx5mjJlXZUGwIUQse6bC5Fk0&s=VDmJuosymKK0Q_sKFRzuzaW_H6-dJdxItvNY4D2Da-U&e=>
Conference tel: 18663127353,,8035340905#

From: David Louie <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>
Date: Thursday, August 15, 2019 at 3:50 PM
To: Mark Mumy <mark.mumy at sap.com<mailto:mark.mumy at sap.com>>, "markdmumy at gmail.com<mailto:markdmumy at gmail.com>" <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: IQ Group <iqug at dssolutions.com<mailto:iqug at dssolutions.com>>
Subject: RE: [IQUG] [iqug] Sharding my largest IQ table approach

I see Mark.  Fortunately the table we want to shard will never require a join.

However we will look into the 16.1 zone map feature.

Thanks
David


From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Thursday, August 15, 2019 4:39 PM
To: Louie, David <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>; Mark Mumy <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: iqug at dssolutions.com<mailto:iqug at dssolutions.com>
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach


External Email: Use caution with links and attachments
Sharding is simply taking the large object and splitting it up into smaller ones.  Doesn’t matter if it is 2 or 200.  These shards are stiched back together, as you describe, via a UNION ALL view.

Suppose you need to join a table to that view:  select * from small_table, shard_view where small_table.field = shard_view.field

Fairly simple query.  The only way to perform this, though, is to join the small_table to each and every table in the shard_view.  OK, not so bad as IQ tries to do this in parallel.

But what if I change the query slightly:
select *, count(*) from small_table, shard_view where small_table.field = shard_view.field
group by small_table.field2

Now, IQ has to group the sharded data by a field that the sharded data is not sharded by.  This is an immensely more complex query that will take a lot of time, compute, and store resources.

No matter what, sharding can lead to poor performance.  With my example, though, we can help alleviate some of that impact by making sure that nearly all data is in a single table.

Ultimately, though, using IQ 16.1 SP03 or SP04 would allow for the best flexibility as you wouldn’t be forced to shard the table.  You could possibly use the zone map feature and reduce the HG index count to keep load performance up without losing performance.

Mark

Mark Mumy
SAP Platform and Technologies Global Center of Excellence
M +1 347-820-2136 | E mark.mumy at sap.com<mailto:mark.mumy at sap.com>
My Blogs: https://blogs.sap.com/author/markmumy/<https://urldefense.proofpoint.com/v2/url?u=https-3A__blogs.sap.com_author_markmumy_&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=zOseCgizu2EhwiBzcMUTPHQlGNDNmxZePytUl-NzqNQ&s=iuMdUfSJkLiE3U0aa6Gqzlh46m2LRffgPCb1DNR_qmI&e=>

https://sap.na.pgiconnect.com/I825063<https://urldefense.proofpoint.com/v2/url?u=https-3A__sap.na.pgiconnect.com_I825063&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=zOseCgizu2EhwiBzcMUTPHQlGNDNmxZePytUl-NzqNQ&s=vLBGv7k2k8IC1cw6b5eywwTN4feyopEP675HQ1mY2pk&e=>
Conference tel: 18663127353,,8035340905#

From: David Louie <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>
Date: Thursday, August 15, 2019 at 3:27 PM
To: Mark Mumy <mark.mumy at sap.com<mailto:mark.mumy at sap.com>>, "markdmumy at gmail.com<mailto:markdmumy at gmail.com>" <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: IQ Group <iqug at dssolutions.com<mailto:iqug at dssolutions.com>>
Subject: RE: [IQUG] [iqug] Sharding my largest IQ table approach

Thanks Mark.

What is the difference between ‘sharding’ and the current and history approach which I originally suggested.

Would sharding imply more than 2 tables in the view which brings them all together and hence would be more efficient for queries vs sharding across more than 2 tables?

-David


From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Thursday, August 15, 2019 4:08 PM
To: Louie, David <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>; Mark Mumy <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: iqug at dssolutions.com<mailto:iqug at dssolutions.com>
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach


External Email: Use caution with links and attachments
I would skip 16.0 and go to 16.1.  16.0 goes end of life quite soon, so that in the next 6 months you will be doing another upgrade.

According to the IQ PAM<https://urldefense.proofpoint.com/v2/url?u=https-3A__apps.support.sap.com_sap-28bD1lbiZjPTAwMQ-3D-3D-29_support_pam_pam.html-3Fsmpsrv-3Dhttps-253A-252F-252Fwebsmp102.sap-2Dag.de-23pvnr-3D67838200100900005877-26pt-3Dg-257Cd-26ainstnr-3D67838200104900016656-26ts-3D0&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=574XgbVqNTENu3ZFuUIvW2zjimZuqYsJzBl1BPGoxIE&s=DCuRg7RBQ6WoC9-e0IXLeTJLAWpGgjTG8Pwkb2FbjD0&e=>:
End of Mainstream Maintenance:

2020-11-30



IQ 16.1 SP03 and later has a feature called Zone Maps.  One of the reasons that your loads get slower is because of HG indexes.  We’ve made strides with these over the years, but they can still get in the way.  We introduced Zone Maps as a way to help with large tables to the point where HG indexes may not be needed.  If you reduce the number of HG indexes, you reduce the need to shard the table.

Sharding will always come with a performance hit as the data has to be stitched back together at runtime for each user.  With some users and queries, the impact may be negligible.  For others, it may be a significant hit.

Another approach would be to do what you propose with a current and history table.  The difference being this.  Data is always loaded into CURRENT.  And there is a view that joins CURRENT and HISTORY.  The difference is that at some set interval, maybe hourly or daily, the CURRENT table data is moved into HISTORY and then removed from CURRENT.  This way most query operations are focused on the data in HISTORY and CURRENT is small so that loads are fast.  It does require a maintenance window to move the data, though.

Mark

Mark Mumy
SAP Platform and Technologies Global Center of Excellence
M +1 347-820-2136 | E mark.mumy at sap.com<mailto:mark.mumy at sap.com>
My Blogs: https://blogs.sap.com/author/markmumy/<https://urldefense.proofpoint.com/v2/url?u=https-3A__blogs.sap.com_author_markmumy_&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=574XgbVqNTENu3ZFuUIvW2zjimZuqYsJzBl1BPGoxIE&s=A9xlJjchiukY6iSl8O5VtAybhfWTSBev-HP6pDAcdVg&e=>

https://sap.na.pgiconnect.com/I825063<https://urldefense.proofpoint.com/v2/url?u=https-3A__sap.na.pgiconnect.com_I825063&d=DwMGaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=574XgbVqNTENu3ZFuUIvW2zjimZuqYsJzBl1BPGoxIE&s=K9CrsVNCrAoJZUOfr4FdhOhwKTFpJqd4azP8cjs_1Zw&e=>
Conference tel: 18663127353,,8035340905#

From: "iqug-bounces at iqug.org<mailto:iqug-bounces at iqug.org>" <iqug-bounces at iqug.org<mailto:iqug-bounces at iqug.org>> on behalf of David Louie <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>
Date: Thursday, August 15, 2019 at 2:51 PM
To: "markdmumy at gmail.com<mailto:markdmumy at gmail.com>" <markdmumy at gmail.com<mailto:markdmumy at gmail.com>>
Cc: IQ Group <iqug at dssolutions.com<mailto:iqug at dssolutions.com>>
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach

Hi Mark,

IQ 15.4 (but going to 16.0 next month).

Multiple users can see the data some with larger look back periods than others.  Adhoc so it can be anything.

We have data going back to 2000.

David

From: Mark Mumy [mailto:markdmumy at gmail.com]
Sent: Thursday, August 15, 2019 11:34 AM
To: Louie, David <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>>
Cc: iqug at dssolutions.com<mailto:iqug at dssolutions.com>
Subject: Re: [IQUG] [iqug] Sharding my largest IQ table approach


External Email: Use caution with links and attachments
How do you plan on using the data?   A user can see it all?  Only for a period of time, like 1 year at a time?

Which IQ version?
Mark

========================

Sent from my mobile device

On Aug 15, 2019, at 10:15, Louie, David <David.Louie at blackrock.com<mailto:David.Louie at blackrock.com>> wrote:
Hi Folks,

We need to bite the bullet and shard our largest IQ table ( 248 billion rows, 30 TB) as load table times have gotten painfully slow.

So we will create a <table>_history and <table>_current,  our loader will load into the <table>_current> and we will create a new view <table> which does <table_history> UNION ALL <table>_current.

The <table>_history will be the copy of the entire table so we will start with an empty <table>_current (hence reaping the benefits of the smaller footprint and faster loads)

My question is from a view creation and IQ optimizer/performance perspective is this approach fine or would you recommend breaking the <table>_history into additional shards ( <table>_year for example)?

Thanks
David


This message may contain information that is confidential or privileged. If you are not the intended recipient, please advise the sender immediately and delete this message. See http://www.blackrock.com/corporate/compliance/email-disclaimers for further information.  Please refer to http://www.blackrock.com/corporate/compliance/privacy-policy for more information about BlackRock’s Privacy Policy.
For a list of BlackRock's office addresses worldwide, see http://www.blackrock.com/corporate/about-us/contacts-locations.

© 2019 BlackRock, Inc. All rights reserved.
_______________________________________________
IQUG mailing list
IQUG at iqug.org<mailto:IQUG at iqug.org>
http://iqug.org/mailman/listinfo/iqug<https://urldefense.proofpoint.com/v2/url?u=http-3A__iqug.org_mailman_listinfo_iqug&d=DwMFaQ&c=zUO0BtkCe66yJvAZ4cAvZg&r=hvVkP5y30w3_fHBwXsoZ2pqbwQXietniE01Z666b0tE&m=jPeuuWL3l5KwKBQa6_y5kLsluSbOaKnp4Jkz56UWBac&s=US3yxUzi7QKYWZdu4WHisu2OSSvL4F4o0LcaBX9_asY&e=>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://iqug.org/pipermail/iqug/attachments/20191014/e2b8bcd4/attachment-0001.html>


More information about the IQUG mailing list