[IQUG] IQ Recovery point corrupt one of our LUNS for IQ

Louie, David David.Louie at blackrock.com
Mon Nov 6 10:51:20 MST 2017


Ron,

I actually have disk corruption, not index corruption, and I am trying to get off the corrupt device. I see no issues in IQ itself, but the IQ backup and RPA replication are failing because of the corrupt device.

Thanks
David


From: Ron Watkins [mailto:rwatkins at dssolutions.com]
Sent: Monday, November 06, 2017 12:30 PM
To: Louie, David <David.Louie at blackrock.com>; 'Mumy, Mark' <mark.mumy at sap.com>; 'IQ Users Group' <iqug at iqug.org>
Subject: RE: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

Rather than sp_iqemptyfile, I prefer to rebuild the indexes.
This “reads” indexes and “writes” them to RW dbspaces, skipping any readonly dbspaces.
It also has a “side-effect” of failing on corrupt indexes, so you will know which indexes are good and which are bad (if you don’t already know that from a DBCC).
Even when working with corrupt indexes, it may be possible to move “portions” of a corrupt FP index into a new table, thus preserving “some” of the data in a corrupt table. By using rowid() or another data-derived key, you can select sections of data into a new table and delete them as you go, so that what’s left is only the corrupt portions.
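
One way to organise that section-by-section salvage is to bisect the rowid range: try a large range, and on failure split it in half until the failing rows are isolated. The Python sketch below shows only the range bookkeeping. `copy_range` is a hypothetical callback you would supply (for example an INSERT ... SELECT filtered on rowid(), followed by the delete); it is assumed to raise on any error and to leave nothing half-copied when it does.

```python
def salvage_ranges(lo, hi, copy_range, min_span=1):
    """Split [lo, hi] into ranges that copy cleanly and ranges that fail.

    copy_range(a, b) is a hypothetical callback: it should copy the rows
    whose rowid lies in [a, b] to the new table, raise on any error, and
    roll back its own partial work before raising.
    """
    good, bad = [], []
    stack = [(lo, hi)]
    while stack:
        a, b = stack.pop()
        try:
            copy_range(a, b)
            good.append((a, b))
        except Exception:
            if b - a + 1 <= min_span:
                # Cannot split further: record this range as unreadable.
                bad.append((a, b))
            else:
                # Split in half; the lower half is tried first.
                mid = (a + b) // 2
                stack.append((mid + 1, b))
                stack.append((a, mid))
    return good, bad
```

Raising `min_span` trades precision for fewer round trips: the ranges reported as bad become coarser, but far fewer statements are issued against the damaged table.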
I don’t know if you already dropped any advanced indexes, but if you have HGs, you may find that you can re-create the corrupt FP data from the HG if you have some missing data as well.
Ron

From: iqug-bounces at iqug.org [mailto:iqug-bounces at iqug.org] On Behalf Of Louie, David
Sent: Monday, November 06, 2017 10:19 AM
To: Mumy, Mark; IQ Users Group
Subject: Re: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

On EMC's advice, we tried to migrate off the bad IQ device by using dd to copy the data from the bad disk to a new disk. After bringing up the IQ server on the new device, we ran sp_iqcheckdb allocation database and encountered a million corruption errors!

We then reinstated the ‘old bad device’ and were able to come back up clean (we had to fix one table that reported FP index inconsistencies, but then got a clean sp_iqcheckdb allocation database run). Thankfully this worked!

SAP suggested we do the following before we tried the EMC suggestion:


1)    Put the bad dbfile in read-only mode

2)    Create a new dbfile

3)    Run sp_iqemptyfile, which will copy all the data from the read-only ‘bad’ device to the new one

4)    Drop the bad device

This is basically doing at the IQ level what was attempted at the storage level.
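
For reference, the four steps above could be written out roughly as follows. This is a minimal Python sketch that only builds the statement text; it does not connect to a server. The dbspace name, logical file names, and path are placeholders, and the exact ALTER DBSPACE clauses should be checked against the documentation for your IQ version before use.

```python
def emptyfile_runbook(dbspace, bad_file, new_file, new_path):
    """Return, in order, the statements for moving data off one dbfile.

    Every argument is a placeholder name chosen by the caller. The
    statements are returned as text only; nothing is executed here.
    """
    return [
        # 1) Make the suspect dbfile read-only so nothing new lands on it.
        f"ALTER DBSPACE {dbspace} ALTER FILE {bad_file} READONLY",
        # 2) Add a new dbfile on healthy storage (no SIZE clause, as for
        #    a raw device; add one if you use a filesystem file).
        f"ALTER DBSPACE {dbspace} ADD FILE {new_file} '{new_path}'",
        # 3) Relocate the objects off the read-only dbfile.
        f"sp_iqemptyfile '{bad_file}'",
        # 4) Drop the dbfile once it is empty.
        f"ALTER DBSPACE {dbspace} DROP FILE {bad_file}",
    ]
```

Running sp_iqcheckdb allocation database after step 3, as was done after the dd attempt, would show whether the relocated data is clean before the old dbfile is dropped.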

We’ve done sp_iqemptyfile before on this IQ server but not when the underlying disk has corruption issues.

How risky is this? Will I again copy the corruption to the new disk?

Has anyone used this method to get off a corrupt disk before?

Thanks
David


From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Tuesday, October 31, 2017 12:33 PM
To: Louie, David <David.Louie at blackrock.com>; IQ Users Group <iqug at iqug.org>
Subject: Re: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

The tough part is what to do if the device is corrupt.  Some of this is based on how widespread the corruption is, for sure.  If it is widespread corruption then you may lose all the data on that file.  If it isn’t widespread, you may be able to save some or all.  All of that, though, assumes that EMC can mount all the secondary devices and give you a stable platform to start IQ in.  Lots of effort to save a backup, that’s for sure.

Mark

Mark Mumy
Strategic Technology Incubation Group
Customer Innovation and Enterprise Platform |  SAP
M +1 347-820-2136 | E mark.mumy at sap.com
My Blogs: https://blogs.sap.com/author/markmumy/

https://sap.na.pgiconnect.com/I825063
Conference tel: 18663127353,,8035340905#

From: David Louie <David.Louie at blackrock.com>
Date: Tuesday, October 31, 2017 at 09:07
To: Mark Mumy <mark.mumy at sap.com>, "iqug at iqug.org" <iqug at iqug.org>
Subject: RE: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

Mark

We (DBAs) completely agree on this. It is definitely EMC corrupting the device, and I’m guessing the backup ran into issues because IQ backs up blocks. So we believe there is no data corruption for dbcc to check. We suspect that even if we can find the objects on the file system, we may not be able to fix this corruption.

David

From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Tuesday, October 31, 2017 8:51 AM
To: Louie, David <David.Louie at blackrock.com>; IQ Users Group <iqug at iqug.org>
Subject: Re: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

I’m confused.  If EMC is encountering corruption, I don’t see how IQ (which operates at a higher level) can help.  EMC RP knows nothing of IQ and IQ structures.  It’s not an issue of IQ being corrupt and causing an EMC issue.  It seems more like an EMC issue that corrupted IQ.  I would also be surprised if EMC would allow you to use the device.  If it’s been marked as corrupted by EMC, are you saying that you can actually still use the device?  If you can actually mount and see all the secondary devices, then why not do a low level copy of production?  Use something like ‘dd’ to copy the devices over.  This can all be done from within a virtual backup script and would bypass RP.  But if RP is corrupted, will a low level IQ copy, or any work in IQ, actually work?  That’s where I get confused as you’re bypassing the lower level (EMC) and thinking that IQ can actually fix an EMC problem.  If IQ could fix the problem, so could any OS level utility to copy the raw devices.
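
To make the low-level copy idea concrete, here is a minimal Python sketch of a dd-style block copy that also checksums what it reads and what it wrote. The function names and block size are illustrative assumptions, not anything from this thread. Note what the check does and does not prove: matching checksums show the copy is byte-identical, and a byte-identical copy carries over whatever the source blocks contain, damaged or not.

```python
import hashlib

BLOCK_SIZE = 1024 * 1024  # 1 MiB per read; an arbitrary choice


def copy_with_checksum(src_path, dst_path, block_size=BLOCK_SIZE):
    """Copy src to dst block by block; return the SHA-256 of the bytes read."""
    digest = hashlib.sha256()
    with open(src_path, "rb") as src, open(dst_path, "wb") as dst:
        while True:
            block = src.read(block_size)
            if not block:
                break
            dst.write(block)
            digest.update(block)
    return digest.hexdigest()


def file_checksum(path, block_size=BLOCK_SIZE):
    """Return the SHA-256 of a file or device, read block by block."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(block_size), b""):
            digest.update(block)
    return digest.hexdigest()
```

Comparing `copy_with_checksum(src, dst)` with `file_checksum(dst)` confirms the destination holds exactly what was read from the source. It says nothing about whether IQ will consider those blocks consistent; that still needs a check inside IQ.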

I would absolutely get a full backup done to filesystem right now.  You are working without a safety net which could be devastating to your system if something else were to happen to IQ.  I would back up to a filesystem then implement an incremental backup strategy.  Do all this until EMC can fix the issue.
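
As a rough illustration of a full-plus-incremental rotation, the Python sketch below picks the BACKUP DATABASE statement for a given date: a full backup on one weekday and incrementals on the others. The archive directory, file naming, and the choice of Sunday are assumptions made for the example. It only builds the statement text and does not run it.

```python
import datetime


def backup_statement(day, archive_dir, full_weekday=6):
    """Return the BACKUP DATABASE statement to run on the given date.

    full_weekday follows datetime's numbering (Monday=0 ... Sunday=6).
    archive_dir is a placeholder for a filesystem backup location.
    """
    kind = "FULL" if day.weekday() == full_weekday else "INCREMENTAL"
    target = f"{archive_dir}/iq_{kind.lower()}_{day:%Y%m%d}"
    return f"BACKUP DATABASE {kind} TO '{target}'"
```

Calling `backup_statement(datetime.date.today(), "/backup/iq")` from a scheduled job would give the statement for that day.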

Mark


From: David Louie <David.Louie at blackrock.com>
Date: Tuesday, October 31, 2017 at 07:40
To: Mark Mumy <mark.mumy at sap.com>, "iqug at iqug.org" <iqug at iqug.org>
Subject: RE: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

Hi Mark,

Yes, we are using a virtual backup. The primary site seems OK, but we cannot back it up because the backup hits the device corruption, so we have no full backup. Recovery Point can no longer replicate to our DR site because it encounters the corruption on the device too.

So we are in limbo here. The EMC guys are looking into a fix on the storage side, but we’ve been asked to see what we can do on the IQ side to repair the corruption.

We were wondering whether there is a way to identify all objects on the dbspace/file that is showing corruption and run dbcc on them to detect whether the corruption is being seen on the IQ side. I know we can run sp_iqdbspaceobject, but that only shows the dbspace level and not the file level.


Thanks
David

From: Mumy, Mark [mailto:mark.mumy at sap.com]
Sent: Monday, October 30, 2017 1:56 PM
To: Louie, David <David.Louie at blackrock.com>; IQ Users Group <iqug at iqug.org>
Subject: Re: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

David,

Is your primary side OK?  Are you using Virtual Backup to engage Recovery Point?

Mark


From: "iqug-bounces at iqug.org" <iqug-bounces at iqug.org> on behalf of David Louie <David.Louie at blackrock.com>
Date: Monday, October 30, 2017 at 12:33
To: "iqug at iqug.org" <iqug at iqug.org>
Subject: [IQUG] IQ Recovery point corrupt one of our LUNS for IQ

Hello All,

We are in a situation where we are running EMC Recovery Point and it corrupted one of our IQ dbspace LUNs.

The corruption seems to be on the disk side (we suspect it’s block-level corruption), and unfortunately this is preventing us from backing up the production database. The backup fails when it detects the bad dbspace.

Has anyone run into this issue?

Also, will creating a new LUN and relocating from the bad LUN to the new one possibly fix this, or will we only be copying the corruption from one place to the next?

Thanks
David



