Wednesday, December 7, 2011

SSIS Derived Column Expression for Converting Unix int Epoch Dates to datetime

 
For source systems that use milliseconds since the beginning of the “Unix Epoch” for dates (some Unix, Java etc.), I was having some trouble converting them in SQL Server Integration Services with a Derived Column.  The goal, as always, is to end up with a usable datetime data type rather than a (still useful for humans but a pain for systems), string.
The expression I came up with is below, there may be a much more elegant way to do this, please feel free to correct me:

SSIS Expression:
DATEADD("DD", ((DT_I8)(<YOURUNIXDATECOLUMN>/1000/60/60/24)), (DT_DBTIMESTAMP)((DT_WSTR, 25)("1970-01-01 00:00:00.000")))
For good measure, here it is in PL/SQL and T-SQL

PL/SQL
SELECT TO_DATE(
        (
        TO_DATE ('1970-01-01 00:00','YYYY-MM-DD HH24:MI') + 
        (YOURUNIXDATECOLUMN/1000/60/60/24) 
        )
    ,'YYYY-MM-DD HH24:MI')
FROM YOURTABLE AS DATEFROMUNIX
T-SQL 

SELECT DATEADD(DD, (YOURUNIXDATECOLUMN/1000/60/60/24), '1970-01-01 00:00:00') FROM YOURTABLE AS DATEFROMUNIX



Obviously, you can divide by 86,000,000 instead, but I think it makes your code easier to read (clearly converting to days from milliseconds).  Also obviously, the T-SQL could be made into a UDF (on the subject of that, I remain cowardly neutral -- I do, however, have some pretty strong opinions about UGBs).
Note:  I cribbed the PL/SQL from Jive Software’s doc on converting their timestamps, so thanks for documenting that!
Also Note:  If I were giving out prizes for things that make my life difficult, datetime data types would be eligible for a truckload of factory-second tube socks, or slightly dented tins of peaches this week.

Mapping SSIS Data Conversion Transform Data Types to SQL Data Types

 

SQL Server Integration Services Data Conversion Transform Data Types don’t match up to SQL Server Data Types, which makes sense since SSIS isn’t just SQL to SQL.

I found this MSDN article really helpful, albeit, noting the caveat stated on the MSDN page: “These mappings are not intended to represent strict equivalency”

Pulled from the above article, my top 8 are below.

  • int – DT_I4
  • datetime – DT_DBTIMESTAMP (I don’t get it either)
  • nvarchar – DT_WSTR (give length separately)
  • varchar – DT_STR (give length separately)
  • datetime2 – DT_DBTIMESTAMP2 (use for Oracle DATE/TIMESTAMP compatibility, dates less than 1753 etc)
  • bit – DT_BOOL
  • float – DT_R8
  • bigint – DT_I8

Monday, December 5, 2011

Can’t Believe I Didn’t Know This One

I’m endlessly frustrated by working with date and datetime types in T-SQL and PL/SQL, so I often end up complicating myself into a corner working with them. 

I’m running the risk of another “duh” comment by posting something so simple, but the syntax in T-SQL and PL/SQL to get the date of a year, or any number of days from today’s date is just:

T-SQL:  SELECT GETDATE() - 365;

PL/SQL: SELECT TO_DATE(CURRENT_DATE) - 365 FROM DUAL

Thursday, November 17, 2011

Quick Query to Look for Columns in a DB

 

Nothing fancy, just a quick query for ‘stab in the dark’ hunts for data.   I get that when you’re at the point that you’re hoping someone named a column helpfully, you’ve got bigger problems;  nevertheless I needed to throw it together.

Find a string in the list of column names in a database and return table and datatype information.

DECLARE @COLUMN_NAME_TEXT_TO_FIND NVARCHAR(255)
SET @COLUMN_NAME_TEXT_TO_FIND = 'SOMETHING'


select COL.NAME AS COLUMN_NAME, OBJECT_NAME(COL.ID) AS TABLE_NAME, XTYPES.NAME AS DATATYPE, UTYPES.NAME AS USER_DATATYPE, COL.PREC AS COLUMN_PRECISION
from SYS.SYSCOLUMNS COL
JOIN SYS.SYSTYPES XTYPES ON COL.XTYPE = XTYPES.XTYPE
JOIN SYS.SYSTYPES UTYPES ON COL.XUSERTYPE = UTYPES.XUSERTYPE
WHERE COL.NAME LIKE '%' + @COLUMN_NAME_TEXT_TO_FIND + '%'



With only user Data Types:




DECLARE @COLUMN_NAME_TEXT_TO_FIND NVARCHAR(255)
SET @COLUMN_NAME_TEXT_TO_FIND = 'SOMETHING'
select COL.NAME AS COLUMN_NAME, OBJECT_NAME(COL.ID) AS TABLE_NAME, UTYPES.NAME AS USER_DATATYPE, COL.PREC AS COLUMN_PRECISION
from SYS.SYSCOLUMNS COL
JOIN SYS.SYSTYPES UTYPES ON COL.XUSERTYPE = UTYPES.XUSERTYPE
WHERE COL.NAME LIKE '%' + @COLUMN_NAME_TEXT_TO_FIND + '%'




Friday, November 11, 2011

Things That are Awesome in a Sea of Fail –SharePoint List Adapter for SSIS

I’ll get to what’s awesome in a minute, but first I’d like to take this opportunity to tell SharePoint “it’s not me, it’s you.”   I recently transitioned from holding up SharePoint’s database back end, to part of my new job involving consuming SharePoint services.  Having been a giant fan of SharePoint infrastructure, I’m deeply hurt by the fact that, from the consumer side, it seems to make every little thing more complex, but also, more limited -- like the difference between Chrysler cars and the Chrysler Building – the product is functionally lacking but the house is the prettiest in the world (that analogy may have worked better pre-1950s when Chrysler actually occupied the building).   The HA integration in SP2010 is a thing of beauty; it’s very sophisticated, but not overly complex, and really, really, resilient.  Turn that sentence backwards, that pretty much sums up how I feel about trying to do even the simplest data related tasks from the consumer side.  Grammar aside, I really do mean the simplest tasks imaginable.   I can see why it would seem absurd that anyone would want to create a conditional drop down from a SQL DataSource…I probably would have thought that kind of functionality was overkill too (note:  if you’re using a list, rather than a Database as the DataSource, the jQuery libraries for SharePoint can help you with that).  It’s almost certainly due to my ignorance, but  I genuinely don’t get the use case for external content types simplifying anything, or reducing admin overhead, or closing any security holes, unless all your external content type sources are Kerberos integrated.  That seems like it defeats the purpose of giving power users access to that functionality.

Dear Power-User Pat in Purchasing: 
External content types are here.  You are endlessly empowered by them…but make sure you have a thorough understanding of delegation w/respect to your data…also, no Oracle unless you’re crazy experienced with Open Kerberos platforms…oh…you’re not?  Then you’ll need to have the SharePoint Admin build everything for you. 
Yours, Sally in SharePoint Services

You could fill a world with what I don’t know about external content types, so I would be thrilled to eat my words (although many of them are alliterative and poorly punctuated, so I’ll need ketchup or something).  Actually, two months ago I would have been thrilled; since then, SharePoint hurt my feelings one too many times and we have broken up.  I’ll never forget the good times with that amazing HA, and awesome PowerShell interface, and wicked DPM integration.  It was even cute when you appended everything with a GUID, and ignored the AutoGrow settings on the model DB if the Admin didn’t check with me before adding a content or service DB, or that weird thing you do silently with auto update statistics:  it made you quirky, like collecting vinyl or growing an ironic mullet.  I’ll always cherish the memory of how Highly Available you were, but I think HA is generally most useful when it applies to something I have a use for.  

For example:  I have a dog who has a thing for a particular pair of shoes.  She ate the heel off the left shoe of the pair, but I loved them so much I bought another pair.  She promptly ate the left heel off the new pair.  I was left with redundant, right-foot, patent-leather pumps.  They’re Highly Available, I’m totally set if I lose one, and they’re very pretty, but they are useless to me, and only serve as a cruel reminder of how great they used to be.  So what I’m saying, SharePoint, is if you don’t quit eating my shoes, maybe we should have a conversation about what your life was like before you landed in a pile of pillows and delicious shoes…actually, wait…I think I lost my train of thought.

Things that could also be considered Highly Available if the definition of HA didn’t have some sort of utility based bar to entry:

  • Computing architectures built around n+1 buckets of plastic marbles in order to ensure consistent availability of n buckets of marbles (note to self: make sure they’re kind of melted together a bit so no-one gets any ideas about designing a manual computing system with them, because that might be close to useful)
  • Ant density > 30:1  per any given square foot at a picnic (unless you are an anteater, in which case, it’s your lucky day and also, unlikely to stay HA for long)
  • Kardashians – I’m sure they’re nice girls; there are just so many of them and I still don’t know why they’re famous.


Mabel is an expert in HA footwear and would like to get paid in bacon
All that whining aside, I wanted to post a shout out/thank you to the people who wrote  the SharePoint list adapters for SSIS.  I know they’re not new, or even news, but in the few days before I archived this entry and decided not to post it (because I just sound bitter about my breakup with SharePoint, and also most people don’t share my deep concern for shoes and the Boxers who find them delicious) I had talked to several people who have been beating their heads against getting data from SharePoint as a WebService and were super excited that these adapters exist, so I’m posting a link to them (not that they’re hard to find, but some of us live in a cave).  After fighting all kinds of stuff that doesn’t quite work, it’s so great to find something that…just…works.    I have far less to say about them than SharePoint in General, because they do exactly what they say they’re going to do, nothing more, nothing less, and they just work.  Really, you’re just setting us up for disappointment when the rest of the world doesn’t live up to impossibly high expectations.   But yeah, mostly, just thank you!

It’s well documented on the codeplex site, but just to recap how ridiculously user-friendly they are.  You install the adapters, they show up as SharePoint List Source and SharePoint List Destination in your Data Flow tasks.  You’ll probably need to run it through a Data Conversion Transform, but from there, right into the Database!  Nice!  The red x and lack of flow in the example below are because my working packages show system specific info, so I figured maybe less with the screenshots of my production environment.  Once configured, it works like a charm!
image

SharePoint 2010 Application Crawl Store DB Grows Out of Control

Problem:  On SharePoint 2010, the Search Service Application Store DB grows out of control.  One table MSSCrawlURLLog is giant, and compared to the other tables, I mean really huge -- a couple of orders of magnitude bigger (MSSCrawlhostsLog may also be very large).

Reason:  Crawl logs are maintained for a very long time by default (it appears to be 30 days), there isn’t a way to configure this in the GUI.  Only errors are logged, and they’re logged with the frequency of the indexing crawl schedule.  This should be fine, but if you, for instance, have a pretty large farm and are migrating from 2007 to 2010, and you, say, installed the Fab40 templates, or have users who hardcoded URLs, these logs will get very large, very quickly.  There are a lot of postings about this issue, and the most recommended way to fix it is to reset the crawled content and re-crawl. With a large farm this is a search outage (search may be up, but you won’t get results until content is indexed), and takes a long time. I’m not crazy about this solution, but I haven’t found anything explicitly saying my fix is supported, so if in doubt, go with the reset.

One Fix:  In the Search Service Application Store DB (default name Search_Service_Application_StoreDB_funGUID), there is a procedure proc_MSS_CrawlReportCleanup.   It’s not super documented, and  I can’t find any way to call it through the GUI or PowerShell.  But it does exactly what you think it would do, and cleans up the offending tables.
Our environment is mirrored, so I wrapped calling the proc in the proc below that checks for mirroring primary first.  This shouldn’t be a problem on a standalone DB, but you could take it out for good measure if it doesn’t apply.  Pass the name of your crawl store database and the number of days you want to retain to the procedure.  It will generate a lot of IO if your table is growing a lot.  We run it off-peak.  Also, it’s not a bad idea to take a look at the error log and see if there are any errors that could be cleaned up since errors that don’t get logged, don’t have to be cleaned up.
USE [YOURUTILITYDB]

GO

/****** Object:  StoredProcedure [dbo].[PRUNE__SEARCH_CRAWL_STORE]    Script Date: 11/11/2011 16:29:13 ******/
SET ANSI_NULLS ON
GO

SET QUOTED_IDENTIFIER ON
GO


CREATE PROCEDURE [dbo]
.[PRUNE_SEARCH_CRAWL_STORE]
   
    (
   
--Pass the name of your crawlstore database as CrawlstoreDBName default is Search_Service_Application_CrawlStoreDB_somecrazyguid
     @CrawlStoreDBName
nvarchar(1000),
   
--@pruntime sets the number of days of error history to retain
     @prunetime
int
  
)

AS
BEGIN

declare @cmd nvarchar
(2000)

set @cmd = @CrawlStoreDBName +'.dbo.proc_MSS_CrawlReportCleanup '

-- For mirrored databases, the proc/sqlAgent job should live on both servers, and only execute on the primary
--Check the current status of the database because the command will fail if executed on a mirror database
if(select m.mirroring_rolefromsys.databases djoinsys.database_mirroring mon d.database_id = m.database_idwhere d.name = @CrawlStoreDBName) = 1
begin
  
--If the database is primary, run the log prune job
      
exec @cmd @prunetime
end

END
GO



SQL 2008 R2 Mirroring. Add a Witness After the Fact

Just a quick note:
We ran into a situation with a database already running mirroring without witness/failover to which we wanted to add a witness.  To add a witness server after the fact, we ran the mirroring wizard which appeared to be successful.  Failover, however, didn’t work, and the log on the secondary was full of permissions errors.
If you run all SQL services as the same user across your primary/secondary/witness, you won’t see this error…I’m not suggesting that’s a good idea.  In our case all the SQL services run as dedicated domain accts per server, and running the mirroring wizard doesn’t add the witness server account to the secondary if mirroring without witness is already configured.

There are two ways to fix this:

  • The right way:  Figure out how to assign permissions to the mirroring endpoint for the witness server service account on the secondary (I gave up on this because cheating is so much easier, and it’s not something we have to repeat).
  • The Kind of Cheating Way I Fixed It:  Create a dummy DB on the primary and secondary (don’t worry about the real mirroring setup, you just need the permission wizard to run).  On the dummy DB, run the mirroring setup as you would normally for synchronous mirroring with witness.  Drop the dummy DB.  Re-run the mirroring setup to add a witness on the primary database.  Test failover.