Tuesday, December 6, 2011

HP DL140 with SATA disks: very slow I/O

An HP DL140 G3 I was repurposing originally came with SAS disks, but I decided (ha! ha!) to use inexpensive 2TB SATA disks on the rebuild instead. I rebuilt with CentOS 5.7. When the system came up, iostat consistently showed iowait in the low teens even when the machine was more or less idle, and heavy writes ran at about 7.6 MB/s.
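(If you want to watch for this yourself, iostat lives in the sysstat package; -x gives extended stats:
iostat -x 5   # %iowait in the avg-cpu block, await/svctm per device
)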

WTF.

I updated the system ROM and storage controller firmware - no luck. The DL140 comes with horrible, arcane RAID software that only supports a RAID 0 or RAID 1 config, so I tried both, and then neither - nothing seemed to help. (Yes, this was a pretty labor-intensive process.) The controller, at least, does properly identify the disks as SATA. Eventually I reinstalled CentOS with no hardware RAID and let CentOS do software RAID, because the performance seemed literally identical either way.

Using HP's (RedHat) driver didn't seem to matter in any of these cases. Per this old thread, I found that I was already using ata_piix rather than the generic IDE driver. I believe I exhausted all the DMA enable/disable options in the BIOS during the BIOS-updating spree (but I suspect I don't fully understand everything in that thread).

lsiutil couldn't control the RAID - naturally, because there was no RAID - but I was able to enable the disks' write cache with sdparm instead. sdparm --set=WCE /dev/sda made the average I/O immediately leap to around 20 MB/s, which was suddenly usable. I experimented with schedulers a bit and found that noop was a little more consistent at keeping I/O above 21 MB/s - cfq sometimes dropped to 18.
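For reference, the exact commands (assuming /dev/sda - adjust for your device; --save should persist the setting across power cycles, if the drive supports saved pages):
sdparm --get=WCE /dev/sda                    # is the write cache on?
sdparm --set=WCE /dev/sda                    # turn it on until the next power cycle
sdparm --set=WCE --save /dev/sda             # turn it on and save it
echo noop > /sys/block/sda/queue/scheduler   # switch the I/O scheduler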

This speed is still pretty crappy, so I can only assume there's some configuration setting that could make it better. There may ultimately be a hardware issue too - I remember reading somewhere that using 6 Gb/s SATA disks on a motherboard that only supports 3 Gb/s SATA is actually slower than using 3 Gb/s disks, though I can no longer find the reference.

Wednesday, September 21, 2011

ArcSDE / Oracle Cannot Initialize Shared Memory

Someone (not me! ha ha!) rebooted a server running ArcSDE 9.3 with local Oracle 10g, and afterward ArcSDE was giving these errors on startup:
ERROR in clearing lock and process tables.
Error: -51
DBMS error code: -12704
Error PL/SQL block to clean up hanging entries 
ORA-12704: character set mismatch
 
ERROR: Cannot Initialize Shared Memory (-51)

... so... I guess... the character set is wrong in Oracle? BUT NO IT'S CORRECT.
Or there's some permissions issue? NO THAT'S NOT IT EITHER.
The issue was fixed - though this is not the recommended solution - by deleting all the files owned by the sde user in /tmp/; in this case, just these helpful socket and FIFO files:
/tmp/SDE_9.3_esri_sde_iomgr_shared_semaphore  /tmp/sde_server_to_client_FIFO_esri_sde_0
/tmp/sde_client_to_server_FIFO_esri_sde_0     /tmp/sde_server_to_client_FIFO_esri_sde_1
/tmp/sde_client_to_server_FIFO_esri_sde_1     /tmp/s.esri_sde.iomgr
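If you get desperate enough to try the same thing, a sketch - this assumes your ArcSDE service account is literally named sde, and you should eyeball the list before deleting anything:
find /tmp -maxdepth 1 -user sde -ls       # review what would be removed
find /tmp -maxdepth 1 -user sde -delete   # actually remove it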

Users experienced some instability until ArcSDE was able to regenerate all the files, which happened after about 20 minutes. So even though this was a bad thing to do and should not have worked anyway, I offer it to you, o interwebz, in case you get as desperate as I did. At the very least, perhaps it will be reassuring to be able to find "DBMS error code: -12704" in Google. It's not just you!

Useful references I found: Administering ArcSDE for Oracle (pdf)

Thursday, September 15, 2011

Xen "blocked for 120 seconds" I/O issues on G7 blades

The issue manifests as crazy load during times of high I/O, with a very high amount of "dirty" memory in /proc/meminfo, and messages like "INFO: task syslogd:1500 blocked for more than 120 seconds" in dmesg that reference fsync further down the stack. It seems to primarily affect HP CCISS disk arrays. I'm mostly concerned about it in RedHat/CentOS, but it may be a bug for users of other distros too. When I wrote about this issue before, I was using G6 HP BL460c Blades - now I'm using G7 blades as well.

RedHat and others claim that you just need to update the driver to the latest from HP, but this emphatically did not work for me. What does seem to work, for both the G6s and G7s, is a band-aid solution that puts a ceiling on I/O throughput and uses the noop scheduler. Additionally, for G7s, the cciss kernel module available from EL Repo seems to help (though the HP driver is still ineffective, as far as I can tell). With a 50MB/sec max in place, xen guests seem to consistently be able to write at about 47-49MB/sec. Without the max in place, xen guest I/O can be as low as 2MB/sec.

As of this moment, the stable configuration for Blade G7s seems to be:

Blade
CentOS 5.7
kernel - 2.6.18-238.12.1.el5xen (Tue May 31 2011! So old!)
with added cciss module (explained below)
using noop for scheduler (explained below)
/proc/sys/dev/raid/speed_limit_max set to 50000

Xen Guest
CentOS 5.7
kernel - 2.6.18-274.el5xen
using noop for scheduler (explained below)
No additional kernel modules, no need to set /proc/sys/dev/raid/speed_limit_max

How to... Add cciss module from EL Repo
Previously, changing the driver never seemed to help anything, but a particular version of the cciss driver that ultimately derives from this project, available at EL Repo, seems to help with the G7 blades.

modinfo cciss  # see what you have now
rpm --import http://elrepo.org/RPM-GPG-KEY-elrepo.org
rpm -Uvh http://elrepo.org/elrepo-release-5-3.el5.elrepo.noarch.rpm
yum --enablerepo=elrepo-testing install kmod-cciss-xen.x86_64
modinfo cciss  # info should have changed

My G6s are currently running without this module added and no I/O issues, so I have no advice one way or the other whether to use this on your G6s.
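To confirm which driver you're actually running after a reboot (not just which module file is on disk), something like this should work - the /proc path depends on controller numbering, so adjust as needed:
modinfo -F version cciss        # version of the module file on disk
cat /proc/driver/cciss/cciss0   # info on the running driver for controller 0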

How to... set /proc/sys/dev/raid/speed_limit_max
To set it temporarily, just until you reboot:
echo "50000" > /proc/sys/dev/raid/speed_limit_max

To set it permanently, taking effect after you reboot, add the line
dev.raid.speed_limit_max = 50000
to /etc/sysctl.conf.
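You can also apply it immediately, no reboot required:
sysctl -w dev.raid.speed_limit_max=50000   # one-off; or use 'sysctl -p' to re-read /etc/sysctl.conf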

How to... Use noop for the scheduler on both blade and Xen guests
You can temporarily set the I/O scheduler on your machine with:
echo noop > /sys/block/[your-block-device-name]/queue/scheduler
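To check which scheduler is active (it's the one in brackets), assuming your device is sda:
cat /sys/block/sda/queue/scheduler   # e.g.: [noop] anticipatory deadline cfq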

Long-term, edit /etc/grub.conf with elevator=noop so that the scheduler is always set on startup:
        title CentOS (2.6.18-274.el5xen)
        root (hd0,0)
        kernel /vmlinuz-2.6.18-274.el5xen ro root=/dev/VolGroup00/LogVol00 elevator=noop console=xvc0
        initrd /initrd-2.6.18-274.el5xen.img

What's the deal with the scheduler? noop performs fewer transactions per second in exchange for putting less of a burden on the system; Wikipedia lovingly calls it "the simplest" I/O scheduler. It's not clear to me whether this works because it has fewer moving parts, as it were, to foul up with the driver, or because it's just slower, effectively acting like the ceiling on raid/speed_limit_max.

Wednesday, August 24, 2011

open list of urls in a bunch of tabs on chrome on os x

I put the urls, one per line, in a text file, "url-list.txt".

Initialize a fakey user data dir for Chrome to use:
$ /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --user-data-dir=/tmp/
(Chrome will pop up... pick a search engine and all that nonsense, then hit ctrl+c when done)

Open all the urls in a bunch of tabs:
$ cat url-list.txt | xargs /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --user-data-dir=/tmp/

If you don't specify a user data dir and you have Chrome open (of course you do), you'll get an "Unable to obtain profile lock" error message. Besides, you may want to keep this list of urls separate from your user profile anyway.

If you don't want to keep the urls separate from your user profile, exit Chrome first, and just use:
$ cat url-list.txt | xargs /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome
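If you do this a lot, you could wrap it in a shell function in ~/.bash_profile (chrometabs is just a name I made up):
chrometabs () {
  xargs /Applications/Google\ Chrome.app/Contents/MacOS/Google\ Chrome --user-data-dir=/tmp/ < "$1"
}
... and then chrometabs url-list.txt.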

Thursday, August 11, 2011

lazy/easy NFS share on CentOS

There are many guides available online for making NFS shares. But where is the guide for the lazy and stupid person like me? It is here! Hooray!

On the Server...

  1. Install packages, if missing: yum install portmap nfs-utils nfs-utils-lib
  2. Add a line to /etc/hosts.allow as needed, e.g. portmap: nnn.nnn.0.0/255.255.255.0, to allow other servers on your local network to reach portmap. You can also allow this on an IP-by-IP basis, e.g. portmap: nnn.nnn.nnn.nnn, nnn.nnn.nnn.nnn, or you can use wildcards. (more info) Note: if wildcards etc. don't work for you at first, try single IP addresses.
  3. Add lines to /etc/exports specifying the directories you want to share and the hosts to which you want to share them (when you change this file later, see the exportfs shortcut after this list). E.g.:
    /directory/to-share	machine.ip.ad.ress(options)
    /somedir/specific-machine	nnn.nnn.nnn.nnn(rw,no_root_squash,sync)
    /somedir/couple-machines	nnn.nnn.nnn.nnn(ro)	nnn.nnn.nnn.nnn(rw,no_root_squash,sync)
    /somedir/entire-network		nnn.nnn.0.0/255.255.255.0(rw)
    /somedir/wildcards		nnn.nnn.nnn.2*(rw,sync)
  4. Are you also using iptables? If so, you'll want to open up a bunch of ports, and also edit some of the nfs settings to restrict the ports NFS is using.

    In /etc/sysconfig/nfs you'll want to set the following (the LOCKD lines are usually already there, commented out - they pin lockd to the ports the firewall rules below expect):
    STATD_PORT=10002
    STATD_OUTGOING_PORT=10003
    MOUNTD_PORT=10004
    RQUOTAD_PORT=10005
    LOCKD_TCPPORT=32803
    LOCKD_UDPPORT=32769


    In /etc/sysconfig/iptables you'll want to set something like:
    -A RH-Firewall-1-INPUT -p udp -m udp -m multiport --dports 111,1110,2049 -j ACCEPT
    -A RH-Firewall-1-INPUT -p tcp -m tcp -m multiport --dports 111,1110,2049 -j ACCEPT
    -A RH-Firewall-1-INPUT -p udp -m udp --dport 32769 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 32803 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 10002 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 10003 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 10004 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 10005 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 10006 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m tcp -p tcp --dport 10007 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 10002 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 10003 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 10004 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 10005 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 10006 -j ACCEPT
    -A RH-Firewall-1-INPUT -m state --state NEW -m udp -p udp --dport 10007 -j ACCEPT
    And then of course service iptables restart.
  5. Set services to start automatically, because we're lazy: for i in nfs portmap; do chkconfig $i on; done
  6. Restart services: service portmap restart, service nfs restart
  7. Check status: rpcinfo -p localhost.
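The exportfs shortcut mentioned in step 3: once everything is running, you don't need a full service restart after editing /etc/exports.
exportfs -ra   # re-read /etc/exports and re-export everything
exportfs -v    # show what's currently exported, with options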
On the Client...
On the client, you shouldn't need to open up any ports. You can just add a line like:
remote.server.addr:/remote/share	 /local/mount	nfs	noatime
to /etc/fstab, and then use mount /local/mount to mount it. (I use /etc/fstab for laziness, of course.) SO LAZY.
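Lazy bonus check from the client, before or after editing fstab - ask the server what it's exporting:
showmount -e remote.server.addr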

Wednesday, August 10, 2011

disabling oradiag_user logging for oracle instant client

After getting the Oracle Instant Client and DBD::Oracle successfully installed, users noticed that they were getting enormous log files in oradiag_[username]/diag/clients/user_[username]/host_nnnnnn_nn, and that logging was apparently set to level 16, which is full ridiculous logging. E.g. alert/log.xml had:

<msg time='2011-08-10T13:36:50.258-04:00' org_id='oracle' comp_id='clients'
type='UNKNOWN' level='16' host_id='hostname.domain.etc' host_addr='nnn.nnn.nnn.nnn'>
<txt>Directory does not exist for read/write [/usr/lib/oracle/11.2/client64/log] []
</txt>
</msg>


LAME. So lame. Following various advice, I added the following to $ORACLE_HOME/network/admin/sqlnet.ora:

TRACE_LEVEL_CLIENT = OFF
TRACE_DIRECTORY_CLIENT = /dev/null
LOG_DIRECTORY_CLIENT = /dev/null
LOG_FILE_CLIENT = /dev/null
LOG_LEVEL_CLIENT = OFF


... all apparently for nothing. However, I eventually found Oracle document 454927.1 (note: you need to be logged in to support.oracle.com for that link to work), which indicated I also needed to disable ADR (the fancy-pants XML-based logging system, new as of client 11.2) using DIAG_ADR_ENABLED = OFF. THEN you are back in the old logging mode, at which point the old settings for completely avoiding OCI logging will work. So sqlnet.ora should look like this:

DIAG_ADR_ENABLED = OFF
TRACE_LEVEL_CLIENT = OFF
TRACE_DIRECTORY_CLIENT = /dev/null
LOG_DIRECTORY_CLIENT = /dev/null
LOG_FILE_CLIENT = /dev/null
LOG_LEVEL_CLIENT = OFF


HOWEVER, you must make sure that users have TNS_ADMIN=/usr/lib/oracle/11.2/client64/network/admin/ (or wherever your sqlnet.ora file lives) set in their environment. You can add the line "export TNS_ADMIN=/path/to/your/file/" to /etc/profile to set it for all your users by default; individual users can override TNS_ADMIN in their ~/.bashrc if they want something else.
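Concretely (the override path here is just a made-up example):
# system-wide default, in /etc/profile:
export TNS_ADMIN=/usr/lib/oracle/11.2/client64/network/admin/
# per-user override, in an individual ~/.bashrc:
export TNS_ADMIN=$HOME/my_oracle_config/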

Note - why would you want to disable the Oracle Instant Client logging? Isn't logging inherently good and disabling it inherently evil?
- If your application (e.g. perl) is already catching Oracle errors.
- If your application is in production and making many requests, the I/O from the logging might slow it down to a measurable extent.

Other sources that helped me find this solution: stackoverflow.com, CERN Savannah Bugs #58917, Oracle Forums thread #959329 (only found via the CERN page - gah!).
... thanks guys!

Tuesday, August 9, 2011

install perl DBD::Oracle (Lesson learned: CPAN and yum don't mix)

From the many tales of woe on the web about installing perl DBD::Oracle - from "invalid lvalue in assignment" to mysterious make errors to pages of intricate instructions doubtfully translated from the French - I assumed it was a long and difficult process, and that it was only natural I was having problems installing on 64-bit CentOS. WRONG! It can actually be easy, even for lazy and dumb people like me.

First, just use yum and hand-compiling. Don't add CPAN to the mix.
  1. Add the rpmforge repo and the EPEL repo (see links for instructions) so that you can install perl-DBD and perl-DBI via yum.
  2. Install perl-DBD and perl-DBI via yum.
  3. Download and install the OCI client "basic" and "sdk/devel" packages from Oracle. Note that you might need an older version if you're connecting to an older version of Oracle. Note also that Oracle makes you log in to download these, and that you need both the SDK and the Basic package. I recommend getting the rpms - install with a simple rpm -Uvh.
  4. Oracle puts the libraries in a wacky place, e.g. /usr/lib/oracle/11.2/client64/lib if you're using the 64-bit version of 11.2. Therefore, create a new file, e.g. oci.conf, in /etc/ld.so.conf.d/, with the library location in it, and then run (as root) ldconfig -v to add it (spelled out in the sketch after this list).
  5. Download the DBD::Oracle source from cpan and extract it someplace. 
  6. Set some environment variables:
    export ORACLE_HOME=/usr/lib/oracle/11.2/client64
    export PATH=$PATH:$ORACLE_HOME/bin
    export LD_LIBRARY_PATH=$ORACLE_HOME/lib
  7. Run "perl Makefile.PL -V 11.2.0" in the directory where the OCI client was extracted. Change the version number to whatever the correct version is. This avoids the "I could not determine Oracle client version so I'll just default to version 8.0.0.0" issue.
  8. Run "make install".
  9. You should be done!
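The sketch promised in step 4 (paths assume the 64-bit 11.2 instant client; adjust to taste):
echo "/usr/lib/oracle/11.2/client64/lib" > /etc/ld.so.conf.d/oci.conf
ldconfig -v | grep oracle   # the oci library directory should show up in the output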

The latest versions of the Oracle Instant Client default to making incredibly verbose logs in the user's home directories - how to disable default oradiag_user instant client logging.

Wednesday, July 6, 2011

perl cpan "recursive dependency detected"

Using the CPAN shell for the first time on a new server, after "install Bundle::CPAN" and then "reload cpan", on the next install I got:

Recursive dependency detected:
    Bundle::CPAN
 => ExtUtils::MakeMaker
 => M/MS/MSCHWERN/ExtUtils-MakeMaker-6.56.tar.gz
 => File::Spec
 => S/SM/SMUELLER/PathTools-3.33.tar.gz
 => File::Path
 => D/DL/DLAND/File-Path-2.08.tar.gz
 => File::Spec.
Cannot continue.

in the perl cpan installer. Alas! I thought. Will I have to install the old-fashioned way? Indeed no! This may just be a sign that you need to "install Bundle::CPAN" and then "reload cpan".

But I just did that, damn it! Apparently, "reload cpan" isn't always completely effective after "install Bundle::CPAN". If you encounter this error, try quitting the cpan shell and starting it again.

Proof yet again that almost all tech problems can be solved by restarting the (whatever).

Btw, if you're not using the CPAN shell, use perl -MCPAN -e 'install( q{Bundle::CPAN} )' to install/update the cpan bundle.

Monday, April 18, 2011

xen guest I/O issue: task blocked for more than 120 seconds

(Update to this post here, with more info for G7 blades.)

A lot of people on CentOS/RHEL who run Xen (and apparently also people on Debian/Ubuntu) who have HP or Fujitsu hardware RAIDs are experiencing this issue. Under sustained, heavy I/O on the guest/domU, the guest load average will rise (beyond 20 even on a single-CPU VM) and messages like the following will appear in syslog:

INFO: task syslogd:1500 blocked for more than 120 seconds. 
"echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message. 
syslogd D 0000000000000110 0 1500 1 1503 1491 (NOTLB) 
 ffff8800b0739d88 0000000000000286 ffff8800b8922970 ffff8800b8922970 
 0000000000000009 ffff8800bb2dd0c0 ffff8800baa55080 0000000000002b40 
 ffff8800bb2dd2a8 0000000000000000 
Call Trace: 
 [] :jbd:log_wait_commit+0xa3/0xf5 
 [] autoremove_wake_function+0x0/0x2e 
 [] :jbd:journal_stop+0x1cf/0x1ff 
 [] __writeback_single_inode+0x1d9/0x318 
 [] do_readv_writev+0x26e/0x291 
 [] sync_inode+0x24/0x33 
 [] :ext3:ext3_sync_file+0xcc/0xf8 
 [] do_fsync+0x52/0xa4 
 [] __do_fsync+0x23/0x36 
 [] tracesys+0xab/0xb6 
Two excellent bugs to start out with are:
http://bugs.centos.org/view.php?id=4515
https://bugzilla.redhat.com/show_bug.cgi?id=605444

This problem has apparently cropped up with all -194 kernels and persists into the -238 kernels; I find it continues even with CentOS 5.6 and 2.6.18-238.9.1.el5xen. It affects every version of Xen I've tried, from 3.0.3 to 3.4.3 to 4.1.

There is a firmware update for HP that you should definitely apply, but it doesn't appear to fix the problem completely. (Supposedly, the Fujitsu firmware update does fix the issue.) In particular, some users got an aggravated version of the problem when a firmware issue caused the onboard battery not to charge. Other users were able to help the issue by disabling the irqbalance daemon, though of course you may not want to do that if your VM uses multiple processors.

One extra source of ambiguity with these forum posts is that it's not clear what other people are doing to cause this issue. Post-firmware-upgrade, the issue can only reliably be triggered with sustained, heavy I/O. I used dd if=/dev/zero of=./test1024M bs=1024k count=1024 conv=fsync to reliably reproduce the problem during testing.
- On a dom0, this test would complete in seconds with an average speed of around 72 MB/s.
- On an affected domU, this test would cause the above-referenced load and dmesg messages, render the entire system unresponsive, and average around 300 bytes/second.

The two things that made a material difference in our case, with HP Proliant BL460 G5s running CentOS 5.5 and 5.6 with CentOS guest vms, were, after the firmware update:
 - Switching the I/O scheduler on both the dom0 and the domU to noop
 - Capping the max raid speed on both the dom0 and the domU.

You can temporarily switch the I/O scheduler with something like:
echo noop > /sys/block/[your-block-device-name]/queue/scheduler
... to change the default on reboot, add "elevator=noop" to the kernel line in /etc/grub.conf .

To cap the max raid speed, you can use
echo "50000" > /proc/sys/dev/raid/speed_limit_max
The value is in KB/s: the default is 200000 (roughly 200 MB/s), and this caps it at 50000 (roughly 50 MB/s). I had expected this to matter only on the dom0, since the domU doesn't technically have a raid, but putting it in both places eliminated some intermittent issues.

After changing the scheduler, I found that the domU 1GB copy from /dev/zero would average over 40MB/sec, but would still sometimes freak out, crawl at 2MB/sec, and fill up dmesg with complaints. The load would sometimes climb near 10. After setting the raid max speed, I found that the domU 1GB copy would always be between 47 and 49 MB/sec, the load would never climb above 1.5, and the VM stayed responsive. So it's a band-aid solution, but very effective, and should tide us over until there is a kernel or firmware solution.

*** *** ***

Update 20 June 2011:

Looking over the latest comments (see 28+) on https://bugzilla.redhat.com/show_bug.cgi?id=615543 , it seems that people running certain HP Smart Arrays may completely solve this issue by upgrading their driver AND firmware, though people on other models are not quite so lucky and still need to use the noop scheduler.

I also want to note that if you're in the middle of a slowdown due to this issue and can get any response at all, you'll probably notice that the amount of "Dirty" memory in /proc/meminfo is very high, and running "sync" can sometimes force your system out of the freeze a little quicker than it would find its way out on its own.
awk '/Dirty/ {printf( "%d\n", $2 )}' /proc/meminfo
will give you just the integer by itself, if you want a script to monitor this value.
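If you want to eyeball that value over time rather than run it by hand, a quick loop (ctrl+c to stop):
while true; do awk '/Dirty/ {print $2}' /proc/meminfo; sleep 5; done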

Thursday, January 20, 2011

executing java from php in rhel/centos - not enough space for code cache

The apache user could run java programs just fine from the command line, but even the simplest java command in php would fail:
<?php
system('java -version');
?>

with the error 
Error occurred during initialization of VM
Could not reserve enough space for code cache

We already had the setsebool -P httpd_ssi_exec=1 option on, so Apache had permission to execute things. Strangely, however, this "memory" issue is also somehow due to selinux. Disabling selinux temporarily (e.g. with echo 0 > /selinux/enforce) or setting setsebool -P httpd_disable_trans=1 fixes the problem. I'd like a less sledgehammer-like solution, but messing with other selinux settings does not seem to have any effect.


Edit: As a commenter below indicated, you can use setsebool -P httpd_execmem=1 to allow just the executable-memory mapping the JVM needs - a much more targeted fix. If that doesn't work, well, back to the sledgehammer.
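To check the boolean's current state and flip it (-P makes the change persistent across reboots):
getsebool httpd_execmem
setsebool -P httpd_execmem=1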