Wednesday, May 27, 2009

Scheduled Windows Server Restart Fails Intermittently

Symptom Brief Description:
Scheduled restarts of Windows Server 2003 fail intermittently when using shutdown.exe.
Problem:
A scheduled restart of a Windows 2003 server fails when using shutdown.exe.
Symptom:
Running shutdown.exe from a scheduled job or within a command prompt from a remote administration terminal services session fails with an error:
The computer is processing another action and thus cannot be shut down. Wait until the computer has finished its action, and then try again.(21)
An entry in the Scheduled Task log states
Result: The task completed with an exit code of (15).

Running shutdown.exe from the console or from a remote administration terminal services session connected to the console using the ‘/admin’ switch succeeds.
Solution:
Use the native tsshutdn.exe command instead of shutdown.exe. The tool is designed for application terminal servers, but the remote administration feature also uses terminal services. Tsshutdn will clear hung actions and successfully shutdown the server.

Tuesday, May 26, 2009

Oracle 10gR2 on SLES x64 requires 32-bit Runtime Environment

Symptom Brief Description: Problems with installing Oracle 10g on SLES 10.2 x64. Oracle logs indicate missing libraries.

Novell's 'Oracle Base' included with SLES 10.2 is pretty awesome, and my love for not having to trudge through the labyrinth of Kernel parameters and ever-changing and interdependent library versions is somewhere between my love for ice-cream and puppies. Novell also produces a very fine and thorough configuration document for installing Oracle on SLES (Novell's install doc for Oracle 10g on SLES 10.2 x64). In general, fantastic.

One issue I've run into, however, is that a crucial pre-requisite (32-bit runtime environment on X64 SLES) is easy to miss in the install documentation.

On the x64 version of SLES 10.2, the 32-bit runtime environment is required for an Oracle 10gR2 install. Without it, the install may fail silently, or it may fail like a crime scene, but it will fail (example errors below). The 32-bit runtime environment is shown as checked in the Novell documentation for a clean install, but it is easy to miss as it's in a screenshot that includes unnecessary or optional packages such as Print Server, File Server, AppArmor, and various Desktop environments.

Symptom:
Various install errors which may cause the Oracle Universal Installer to quit altogether, or to complete, but fail silently (until you look at the logs). The Logs will contain errors such as:
'/usr/bin/ld: crt1.o: No such file: No such file or directory'
and/or
Exception Name: MakefileException
Exception String: Error in invoking target 'install' of makefile
Problem:
Both the C/C++ tools, and the 32-bit runtime environment must be installed on x64 SLES for Oracle 10gR2. Without these packages, the install will fail.

Solution:
If you're doing an install from scratch in the graphic install: In addition to the Oracle Base package select:
  • 'C/C++ Compiler and Tools'
  • '32-bit runtime environment'

To check for the required packages on an existing system:
rpm -qa --queryformat '%{NAME}\n' egrep '(((^glib)+).*((32)+))'

Command should return the following required 32-bit packages:
  • glibc-locale-32bit
  • glibc-32bit
  • glib2-32bit
  • glib-32bit
  • glibc-devel-32bit
If these packages are not installed, they can be installed via rug, Yast2, or your install utility of choice.

Oracle 10g import function-based indexes error

Symptom Brief Description: Index creation on import fails with error 942, error message states that tables do not exist which do, in fact, exist.

I ran into this issue over the weekend...

When running an import, I found a number of logged errors indicating that the import could not create an index on a table because the table did not exist.

Fair enough -- it's perfectly reasonable that an index can't be created on a table that doesn't exist. The problem, of course, is that all of the tables referenced did, in fact, exist. This seemed a little 'un-sporting' to me, given the fact that I was looking right at the table(s) . After checking the obvious possible causes (listed below), I noticed that all of the indexes in question were function based which led to finding the answer on metalink. This is a somewhat confusing error message related to a 10.2.0.3 bug which has been fixed in 10.2.0.4. Details are below.


Symptom:

  • On import, some indexes fail on creation with the error below (ORA-00942).
  • On investigation, the table does, in fact, exist.
  • The indexes in question are function-based
Double quotes are produced by logging function and do not indicate a syntax problem in the original statement.

IMP-00017: following statement failed with ORACLE error 942:
"CREATE UNIQUE INDEX "indexname" ON "tablename" (UPPER("column1"
"ID") , "column2" ) TABLESPACE "tablespace" NOLOGGING"
IMP-00003: ORACLE error 942 encountered
ORA-00942: table or view does not exist

Things that should be ruled out:

  • Verify that the table exists (if it doesn't the error is telling you the truth)
  • Verify that the user running the import has permissions to the table.
  • Verify that the index is not referencing a foreign key which does not yet exist.
The actual problem:

The problem is addressed in
Oracle Metalink article/bug 5391326.
The problem is specific to the import of function-based indexes, run as SYS, on versions of 10gR2 less than 10.2.0.4.
If you're already on 10.2.0.4 this isn't your problem.

Workaround:

Run the import as a user other than SYS (with the required permissions).

Fix:

This is fixed in 10.2.0.4


Saturday, May 23, 2009

Listener problem with Oracle 10gR2 after upgrade to 10.2.0.4

Symptom Brief Description: TNS errors 12437, 12560, and Solaris Illegal Seek on listener startup.

We ran into this issue starting an instance after upgrading from 10.2.0.1 to 10.2.0.4 (on Solaris)

Symptom:
On attempting to start the listener the startup fails with the error below
TNS-12537: TNS:connection closed
TNS-12560: TNS:protocol adapter error
TNS-00507: Connection closed
Solaris Error: 29: Illegal seek

Things that may solve this problem in other cases, but didn't work for us:
  • Metalink: 315720.1 -- insufficient permissions to /etc/hosts/ file, or missing localhost entry
  • Metalink: 555609.1 -- excellent article with troubleshooting steps for this error
  • Metalink: 3168651 -- Manually relink

The Problem:

In this case, the LD_LIBRARY path contained a reference to a path containing InstantClient

What Fixed it for Us:
Reset the LD_LIBRARY path to the correct path and the listener started without issue.

Monday, May 18, 2009

Cut and paste between word and csh/ksh

Symptom Brief Description: Single quotes become periods when copied from Word to a csh/ksh session.

Here's another issue that briefly made me nuts.

Symptom:
Cutting and pasting commands between a Word document and a ksh or csh session resulted in single quotes being rendered as periods.

Things that are totally not the cause:
I wasted time assuming it was a unicode issue, even dropping it to notepad, messing around with the encoding, and running it through dos2unix. Do not bother with this! It is *much* more simple than that.

The actual problem:
Word Autocorrect will, by default, replace straight single quotes with smart quotes as you type. This is a problem for ksh or csh and results in them being interpreted as periods.

How to fix it:
To switch off autoformat for smartquotes:
  • Go to 'File', ‘word options’, ‘autocorrect’, ‘autoformat as you type’
  • Uncheck “replace straight quotes with smart quotes”

Any new documents will now default to straight quotes.

To fix existing documents:
  • Highlight one of the smart quotes
  • Hit ctrl-c to get it into the clipboard
  • Hit ctrl-h to bring up find and replace
  • Hit ctrl-v to enter the smart quote in the 'Find' box
  • Enter a straight quote in the 'Replace With' box
  • Hit 'Replace All' to fix them all at once, or if you only want to replace examples of commands, iterate through each.

Sunday, May 17, 2009

Another Fedora 10 Atheros issue

Symptom Brief Description: Ath5k card on Fedora 10 does not come back from suspend.


I could write a book (a relatively short book, but a book nonetheless), on the sheer number and variety of frustrations I've had with the Atheros 5K wireless card on Fedora 10. After initially getting it working, I've lost the card after about 30% of the OS updates I've applied (often to new and interesting issues). There are endless (very good) articles available on the myriad issues with installing the card, or with getting it to connect to a WPA AP. Most of these are more articulate than I would be on these well-documented issues, and can be easily located by Googling the error message, or hitting the Fedora Forum.

I would, however, like to spare others my most recent unnecessary frustration with this card on FC10.

Symptom:
After the last round of updates that I mustered the courage to apply, the wireless card stopped coming back from suspend. The primary symptom I saw was being repeatedly asked for authentication, which, in turn, repeatedly failed. I began to doubt I knew my the password for my AP.

At the time, I couldn't find anything on the issue. In the last couple of months there appear to have been more posts, and Redhat have posted a bug on it. Their recommendation appears to be upgrading to Fedora 11 (which I have not done, nor have I seen verification that this solves the problem). No workaround is posted.

Error Messages:
A tail of the message log showed the following errors.

May 17 16:39:49 localhost kernel: ath5k phy0: gain calibration timeout (2422MHz)
May 17 16:39:49 localhost kernel: ath5k phy0: ath5k_chan_set: unable to reset channel (2422 Mhz)
May 17 16:39:50 localhost NetworkManager: wlan0: link timed out.
May 17 16:39:50 localhost kernel: ath5k phy0: gain calibration timeout (2427MHz)
May 17 16:39:50 localhost kernel: ath5k phy0: ath5k_chan_set: unable to reset channel (2427 Mhz)

As the error clearly says it is "unable to reset channel." This sounds like something a reboot should fix. A reboot, however, results only in:

ath5k phy0: unable to reset hardware: -11

Things you probably shouldn't bother doing:

In trying to get back up and running, I spent unnecessary time beating my head against a wall -- due primarily to my own impatience. I went on a wild goose chase of blind troubleshooting. All of the below were useless in fixing the issue:

  • Warm reboot
  • Hard shutdown allowing only a brief 'off' time.
  • Reinstalling the driver,
  • Attempting to backrev the driver,
  • Attempting to go back to the old madwifi driver,
  • Assuming it was a WPA issue and generally digging myself into a deeper hole.
Do not do these things! I lost three 'Alice in Wonderland-esque' hours of inexplicable frustration 'down the rabbit-hole' (so to speak).

How to 'fix'/workaround it:
Eventually, I shut down out of frustration and came back later. Naturally, this was what solved the issue. It appears that when you shut down, you really have to mean it, and give it a few minute to be sure it's learned its lesson.

I've run into this a couple of times since then, and a reboot, even completely shutting down for a minute or so, does not fix it. Shutting down for several minutes is the only thing that's got me back up and running. I've also found that I can restart in only a couple of minutes if I go the extra mile of removing power and battery. This may be overkill. It may also be the case that it's not necessary to physically leave the room in order to shame my laptop into cooperation. You may think it 'ridiculous' to go so far as to put your laptop in the dog's crate to give it a time out I, however, feel these are all viable (and somewhat satisfying) options.

The actual problem/anticipated long-term fix:
Of course, this is not a satisfactory solution (the dog needs his crate back). If FC11 really does solve the issue, I may eventually become optimist enough to move forward without expecting the same Sysiphusian drudgery (with regard only to the Atheros card -- I've been very happy with FC8, 9, and 10 otherwise and appreciate the work done to maintain a free O/S) on the new release. Until then, I'll be shutting down, crossing my fingers, and hoping for an interim fix.

Friday, May 15, 2009

Configuring iSQL*Plus for web dba access (Linux)

Brief Description: iSQL*Plus/dba can be trickier to configure than the non-dba iSQL*Plus web interface.

Configuring iSQL*Plus for web dba access was tricky for me as an Oracle beginner.

Issue:
I had difficulty finding straightforward instructions on getting it up and running for use on the dba url (basic iSQL*Plus worked as soon as I enabled it and opened the port). By default it should be installed but disabled with a base install of Oracle 10gR2. For complete SQL*Plus configuration information see Oracle's SQL*Plus configuration document.


How to:
The steps that got me up and running are below (I'm on SLES 10.2 x64, but this should apply generally to other distributions), any corrections are welcome.

Note: Some services need to be restarted. Don't do this on a production server without following your company's testing/change control policies.
  • Configure iSQL*Plus for autostart and allow access.
    • Enable iSQL*Plus to start on boot
      • edit etc/sysconfig/oracle and set START_ORACLE_DB_ISQLPLUS = yes
        • sed command is
          • sed -e '/DB_ISQLPLUS/ s/"no"/"yes"/' -i /etc/sysconfig/oracle
      • If you're running a local firewall, open port 5560 for iSQL*Plus
        • To open port on SLES 10.2 (requires root or sudo privileges)
          • With root or sudo privileges
            • edit /etc/sysconfig/SuSEfirewall2
            • add ports to FW_SERVICES_EXT space separated, well known ports can be listed by name (i.e. ssh, http).
              • example: FW_SERVICES_EXT_TCP="5801 5901 ssh 5560 1158 1521"
            • restart firewall services
              • /sbin/SuSEfirewall2 stop
              • /sbin/SuSEfirewall2 start
    • set a temporary JAVA_HOME (10g explicitly unsets JAVA_HOME to avoid conflicts for Oracle Java apps).
      • export JAVA_HOME=$ORACLE_HOME/jdk
    • Verify that the application server process is running
      • ps -eaf egrep "java.security.properties=/ "
    • Start the JAZN (Oracle's Java AuthoriZatioN ) Shell
      • Make sure you're in the working directory
        • $ORACLE_HOME/oc4j/j2ee/isqlplus/application-deployments/isqlplus
      • If this is a new install the default admin for iSQL*Plus will be 'admin' with a password of 'welcome' if this is an existing install, consult your records or another DBA.
      • To start with default user/pass (Do not leave it as default)
        • $JAVA_HOME/bin/java -Djava.security.properties=$ORACLE_HOME/sqlplus/admin/iplus/provider -jar $ORACLE_HOME/oc4j/j2ee/home/jazn.jar -user "iSQL*Plus DBA/admin" -password welcome -shell
      • If password was the default and you are the owner of the system/database, change the iSQLPlus Admin password from JAZN
        • setpasswd "iSQL*Plus DBA" admin old_password new_password
    • Enable iSQL*Plus Web DBA Access
      • Once logged into the shell
        • Create a user
          • adduser "iSQL*Plus DBA"
        • grant the webDba role
        • grantrole webDba "iSQL*Plus DBA"
    • Changes will not take effect until iSQL*Plus is restarted (this affects only the availability of iSQL*Plus, database availability should not be affected).
      • To restart iSQL*Plus
        • $ORACLE_HOME/bin/isqlplusctl stop
        • $ORACLE_HOME/bin/isqlplusctl start
    • iSQL*Plus should now be ready for access
      • the default urls for access are
        • http://:5560/isqlplus (standard)
        • http://:5560/isqlplus/dba
      • Log in as the new user.

Thursday, May 14, 2009

Toad for Oracle TNS Names Path and Editor issues on Windows XP

Brief Description -- Symptoms/Quirks:
Couldn't use TNSNames to connect, and TNSNames Editor was greyed out


Today I ran into much tnsnames related frustration configuring and getting information out of Toad for Oracle on Windows XP.

The Actual Problem:

Issue appears to be due to Toad looking for tnsnames.ora in the main client directory (as in InstantClient) as opposed to the \network\admin directory (for a full client install). Resolved by setting TNS_ADMIN environment variable.


How to:
An Oracle Client install is required before installing Toad (for tnsnames connectivity). The upshot is that the small amount of documentation that there is, applies to the Oracle Instant Client and not to the Full Client, and Toad help does not appear to have been updated to reflect changes in the latest release. My problems turned out to be minor issues, but since the documentation is scant and makes a lot of assumptions, I've listed basic install/configuration info below.

These steps apply to Toad for Oracle 9.7.2 and Oracle Client 11.1 on Windows XP
  • If you do not already have an Oracle Client installed download the Instant Client
    • It will also be helpful to download some of the associated extensions to the package such as SqlPlus and ODBC
  • If you do not already have source for a licensed version of Toad, download the evaluation version
  • Install and/or configure the Oracle Client
    • If using InstantClient
      • Extract the .zip file to a local directory (there are no requirements around the top level location, use one that makes sense to you, but do not change the name of the subdirectories it exports).
      • Verify that the tnsnames.ora is in the working directory of the base of the extracted directory.
        • If it doesn't exist and this is a new install create a text file 'tnsnames.ora'
        • If it doesn't exist and you have a full client install, or an existing InstantClient install, locate the tnsnames.ora (default location is %ORACLE_HOME%\network\admin) and make a note of it.
    • For both the full client and the InstantClient you will need to set environment variables
      • Right click 'My Computer' and select 'Properties'
      • Select the 'Advanced' tab and click the 'Environment Variables' button
      • In 'System Variables' locate and select the PATH variable.
      • Click the 'Edit' button and verify the existence of, or append the client directory location to the end of the path.
        • To add, append a semi colon followed by the path to the client directory (e.g. ;c:\instantclientdir)
      • If you are using the full client, and the tnsnames.ora file is in a location other than the working directory, select 'New' to add a system variable 'TNS_ADMIN'
        • In 'Variable Name' enter TNS_ADMIN
        • In 'Variable Value' enter the directory location of the tnsnames.ora file. (e.g. c:\instantclientdir\network\admin)
      • Verify paths are set (both Instant Client and full Client)
        • In a cmd window type
          • echo %PATH%
          • This should return your normal path with the location of the client files appended.
      • Verify TNS_ADMIN variable (Full Client and clients with existing tnsnames.ora)
          • In a cmd window type
            • Echo %TNS_ADMIN%
            • This should return the location of the tnsnames.ora
  • Install Toad
      • From the Toad install media or download kick off setup
        • No changes from defaults are required during the actual install, choose components and file locations as best suits your environment and licensing.
  • Configure/Verify Toad
    • With the Oracle Client installed and configured, Toad should be able to locate the settings it needs
    • To verify select 'Session' new connection. In the 'Installed Clients' box (bottom right)
      • Verify that a client is listed
      • Click the elipses button to the right and verify that there are no errors listed (they will be red).
        • If values are incorrect, it is most likely an incorrect path in the environment variables.
          • From a cmd window type
            • env
            • Check that the path and TNS_ADMIN variable are correct, and verify there aren't stray oracle client variables with incorrect values (e.g. an incorrect ORACLE_HOME). Be very careful about changing these values if you have an existing client install.
            • Missing values not related to the ORACLE_HOME are not an issue if you have client only install (you don't need a SID etc. unless you have a local database).
    • To verify/configure the TNSNames Editor Utility:
      • On the 'Utilities' menu select TNSNames Editor.
        • If it is greyed out on the menu there is still a path issue (see above)
        • If you created a blank file it will be empty and entries can be added either by editing text directly or by generating entries from direct connections.
          • To add an entry click the 'Add Service' button (looks like a shiny blank sheet)
          • Enter connection information.
            • To enter the SID as SID instead of Service, check 'Use Oracle 8 and previous identification (SID)'
    • To add Oracle Utilities to Toad
      • On the 'View' menu select 'Toad Options'
      • In the left hand pane, Select 'Executables'
      • Enter the path for any Oracle Utilities you have installed or downloaded.
Thanks to John P. from Quest who provided some as listed links below:

Monday, May 11, 2009

SLES VMware Templates with OSP VMware Tools

I've been working on supporting Novell SUSE Linux Enterprise Server 10.2 SP2 (64-bit) on VMware 3.5U3.  The standard VMware tools installation from within Virtual Center is difficult to support because kernel patches may require a recompiliation.  Recently, VMware released precompiled tools known as Operating System Packages.  These precompiled binaries support installation and updating using the native operating system tools--in this case rug.
Here is a simple process to subscribe to the OSP tools:
mkdir /tmp/gpgkey
cd /tmp/gpgkey
wget http://packages.vmware.com/tools/VMWARE-PACKAGING-GPG-KEY.pub
rpm --import /tmp/gpgkey/VMWARE-PACKAGING-GPG-KEY.pub
rug service-add --type=YUM http://packages.vmware.com/tools/esx/3.5u3/sles10/x86_64 vmware-tools-collection
rug subscribe vmware-tools-collection
rug install -y -c vmware-tools-collection vmware-tools
Update all vmware tools is simple at this point:
rug update vmware-tools-collection
As is removal of all vmware tools:
rug remove vmware-tools-* open-vm-tools-*
I have come across one issue with the use of the OSP tools which I am researching with the help of VMware support: templates seem to fail when using OSP.  Changing out the tools to the native Virtual Center tools allows the template to succeed.  The error message is "Customization of the guest operating system 'sles10_64Guest' is not supported in this configuration..."
Hopefully this is just some simple installation mistake.  If not, I may be stuck with the Virtual Center vmtools for now.