Thursday, June 21, 2012

Oracle DATE Joins Fail for no Obvious Reason


Brief Description:  Attempting to join columns of the same DataType results in errors like the below, or produce no results when there are valid joins in the tables
ORA-01858:  a non-numeric character was found where a numeric was expected

image
Problem:
I had a mildly (okay, deeply) frustrating couple of hours being beaten up by this problem.   I had two tables, one a dimension with general date info and another a fact table with a couple of date columns.  All three columns showed as DATE datatype, but while column1 joined to the date dimension without issue, column2 failed explosively or returned no results and with a number of different errors depending on what variety of date conversion, no date conversion, convert to string, then to date, convert to date with the month spelled out...etc.  All dates looked the same format in query results (in my case using Toad).  My management studio was displaying in the painfully old 'DD-MON-YY' format (which may be an artifact of this particular database having been continuously upgraded for years).  Although I can't swear to what is actually happening under the covers, forcing the NLS_DATE_FORMAT for the session (or in the case of Toad, the startup script) fixed the problem for me.  This may not be a problem in SQL*Plus, but I've had my share of frustration with NLS_DATE in SQL*Plus, so it's possible it's also applicable there.
In Toad:
To fix for this session only:
ALTER SESSION SET NLS_DATE_FORMAT='MM-DD-YYYY';
To fix in Toad startup script:
  • On the ribbon menu go to View/Toad Options
  • Select Scripts
  • Under Login scripts edit glogin.sql
  • If this is the first time you've edited the script, add the text below under the comment block, otherwise, enter it below the last operation
  • BEGIN
       EXECUTE IMMEDIATE 'ALTER SESSION SET NLS_DATE_FORMAT=''MM-DD-YYYY''';
    END;
image

Tuesday, June 12, 2012

Least Helpful SSIS Error Message: DTS_E_PRIMEOUTPUTFAILED

Short Description:  An SSIS package throws this error on an OLEDB source component

[SSIS.Pipeline] Error: SSIS Error Code DTS_E_PRIMEOUTPUTFAILED.  The PrimeOutput method on component "My OLEDB component" (1729) returned error code 0x80004005.  The component returned a failure code when the pipeline engine called PrimeOutput(). The meaning of the failure code is defined by the component, but the error is fatal and the pipeline stopped executing.  There may be error messages posted before this with more information about the failure.

Unfortunately, the actual error is almost always downstream from the OLEDB datasource, but isn't always reported.  If you're very, very, lucky and the downstream component hits  the error condition before the OLEDB source fails the primeoutput, the actual error may be recorded earlier in the log/Execution Results.   It seems that the error basically means that somewhere downstream there is a component that either doesn't like, doesn't fit, doesn't understand, or is missing a condition to deal with the data returned by the OLEDB datasource.  Turning up logging (SSIS menu, select logging, check the failing dataflow and add a provider (make sure to check the box next to it to enable it), then decide how much detail you want on the details page)  may or may not result in the error being logged.  If it doesn't show up with verbose logging, I'm not sure of a better way than triple checking, and trial and error removing components to locate the problem.

Just in case it's helpful, here are some times I've run into this condition while writing packages:

  • Unhandled null in a downstream conditional split
  • Trying to retrieve an Oracle old-school, deprecated "Long" datatype with the Microsoft OLE DB data source -- spoiler alert:  it doesn't work...it looks like it's working, but treats it as a text stream and as far as I can tell it then fails once it gets to one that doesn't look like enough like a text stream.   I don't know how to fix this other than phasing out datatypes deprecated last millennium.
  • Constraint violation (in this case, not null) on a downstream component.

Obviously, these conditions should cause the package to fail, but a more clear error message would be a real time saver.

Thursday, June 7, 2012

SSIS OLE DB task very, very, slow updating a small number of records

Description:  an SSIS OLE DB task (in this case an update) with only a few thousand records to update takes a totally, unbelievably, long amount of time to run against a large-ish (~6 million rows) table.

My Scenario:  Writing an SSIS package for ETL against a fairly large Fact Table with only a few thousand changes per day.  Records since the latest audit date are picked up by the task, then looked up and with a conditional split carved into Insert and Update.  Nothing fancy.  The insert (an OLE DB destination), runs quickly, but the update ran so long (Some unknown amount longer than 30min for less than 3k records) in test that I killed it because it was making me sad (also not feasible for our ETL window, but mostly the sad part).

The update (in my case) is pretty standard:  Check the numeric unique key (in our case, we don't enforce the source system primary key) against an (Oracle Date data type) audit field, and if it's newer, update it.  Full Disclosure:  because there are so many records and so few changes, I'm only pulling the fields more recent than the newest existing record in the destination table, but this isn't relevant to the update task problem (although it would cause exciting unnecessary work on the pull from source side).

The Problem:  Everyone else might already know this, but it didn't really hit me: when it says  "Executes an SQL command for each row in a dataset." it really means it.  I hadn't put a huge amount of thought into it before today, but until I really traced it and watched performance, I sort of had the impression it relied on some of the work already done by the preceding Lookup and Conditional Split tasks, and maybe sort of cached the lookups it had already done.  On a scale of 1 to finding out the the Colonial furniture appraiser from Antiques Roadshow is actually two twins, it's not an earth shattering revelation (you go your whole life believing the most shocking thing about colonial furniture is how much it would be worth if you'd retained the original "patina," and then you find out that guy is TWINS!).  The OLE DB command task is performing the updates 1 by 1 and thus doing a lookup against a big table also 1 by 1.  That's a whole bunch of work on 6 million records even though only 3 thousand need to be updated.

Skip My Rambling and Try This: While for truly enormous tables, it may be best to use a staging table approach, with a medium-big table the right indexes took it from an amount of time I gave up on finding out, to less than a minute.

Your Mileage May Vary (especially if your primary key isn't numeric):

A basic index on the audit date field and one on the key (if you're using a primary key, you already have this),  took the update down to less than a minute.  A combined index on audit field first/then the key also reduced the run time significantly (down to 10 minutes), but pales in comparison.