===== Real Application Cluster - InfiniBand for the Interconnect =====

Configuration steps:
  * Set up the InfiniBand drivers
  * Test Oracle RAC with UDP
  * Switch the Oracle kernel to RDS

==== Check kernel parameters ====

^Parameter^Value^
|net.ipv4.ip_local_port_range|1024 65000|
|net.core.rmem_default|262144|
|net.core.rmem_max|262144|
|net.core.wmem_default|262144|
|net.core.wmem_max|262144|

==== Which card is installed? ====

  $ lspci | grep Infini
  ...
  47:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX IB DDR, PCIe 2.0GT/s] (rev a0)
  ...

Drivers:
  * http://www.mellanox.com/content/pages.php?pg=products_dyn&product_family=26&menu_section=34#tab-three

==== Set up the InfiniBand drivers ====

Download the drivers from the SilverStorm website.

=== Installation ===

Oracle uses RDS as its native InfiniBand protocol. \\
RDS requires IP over IB, so both protocols must be selected during installation.

=== Configuration ===

Check the file ipoib.cfg (created by the installer).

=== Start / Stop ===

Check whether RDS is running after boot. \\
Start the RDS driver with "/etc/init.d/rds start" \\
or with:

  # iba_start {rds | ics_init | ipoib…}
  # iba_stop {rds | ics_init | ipoib…}

Use lsmod to check whether the driver is loaded into the kernel:

  lsmod | grep rds

Check the InfiniBand connection between the nodes:

  #1. Check the list of connected hosts
  [root@GPIrac1 ~]# ibnetdiscover -l
  Ca : 0x0002c99999999996 ports 2 devid 0x634a vendid 0x2c9 "GPIrac3 HCA-1"
  Ca : 0x0023788888888880 ports 2 devid 0x634a vendid 0x2c9 "GPIrac2 HCA-1"
  Ca : 0x002377777777778 ports 2 devid 0x634a vendid 0x2c9 "GPIrac1 HCA-1"
  Switch : 0x00227777777777a8 ports 32 devid 0xbd36 vendid 0x2c9 "Infiniscale-IV Mellanox Technologies"
  
  #2. Get the GUIDs of the connected ports to ping
  [root@GPIrac1 ~]# ibnetdiscover -p | grep GPI | awk '{print $11 " -> " $17 }' | grep GPI
  0x0002c9999999999 -> 'GPIrac3
  0x0002c8888888888 -> 'GPIrac2
  0x0002c7777777777 -> 'GPIrac1
  
  #3.
  # Start the ping server on each target host (in a new SSH session)
  [root@GPIrac3 ~]# ibping -S
  [root@GPIrac2 ~]# ibping -S
  [root@GPIrac1 ~]# ibping -S
  
  #4. Ping the targets from all hosts (example for host 1)
  [root@GPIrac1 ~]# ibping -G 0x0002c9999999999
  [root@GPIrac1 ~]# ibping -G 0x0002c8888888888
  [root@GPIrac1 ~]# ibping -G 0x0002c7777777777
  
  # or, with -f (flood) and -c 10 (10 round trips):
  ibping -G 0x0002c9999999999 -f -c 10

==== Test Oracle RAC with UDP ====

Make sure the InfiniBand device is used as the interconnect:

  oifcfg getif -global
  oifcfg delif -global
  oifcfg setif -global ib1/192.168.1.0:cluster_interconnect

==== Switch the Oracle kernel to RDS ====

Which protocol is currently used for the interconnect? \\
See the alert.log of the current database:

  ...
  Cluster communication is configured to use the following interface(s) for this instance
  10.10.10.118
  cluster interconnect IPC version:Oracle UDP/IP (generic)
  ...

Stop the database. \\
**In a cluster, also verify that all instances are really stopped!** \\
Otherwise: **ORA-27550: Target ID protocol check failed. tid vers=1, type=1, remote instance number=3, local instance number=1**

Enable RDS by relinking the Oracle kernel with the RDS library:

  cd $ORACLE_HOME/rdbms/lib
  make -f ins_rdbms.mk ipc_rds

Example output:

  [oracle@c7000rac1 lib]$ make -f ins_rdbms.mk ipc_rds
  # ...
  rm -f $ORACLE_HOME/lib/libskgxp11.so
  # ...
  cp $ORACLE_HOME/lib//libskgxpr.so /u01/app/oracle/product/11.2.0/dbhome_1/lib/libskgxp11.so

==== Oracle Linux 6.1 / Scientific Linux (SL) issue ====

The installation of the Mellanox drivers fails.

Problem:

  ./mlnxofedinstall
  The 2.6.32-100.34.1.el6uek.x86_64 kernel is installed, but do not have drivers available.
  Cannot continue.
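
Since the installer ships no prebuilt modules for this kernel, the drivers have to be compiled locally, and compiling kernel modules needs the build tree of the running kernel under /lib/modules. The following is a minimal sketch of a pre-check, not part of the Mellanox installer; the function name ''has_kernel_build_tree'' and its optional root-prefix argument are assumptions made for illustration.

```shell
#!/bin/sh
# Hypothetical helper (not shipped with MLNX_OFED): succeed if the build tree
# for the given kernel version exists. Out-of-tree kernel modules are compiled
# against /lib/modules/<version>/build.
#   $1 = kernel version, e.g. the output of `uname -r`
#   $2 = optional root prefix (assumption, only there to make testing easy)
has_kernel_build_tree() {
    kver=$1
    root=${2:-}
    # -d follows symlinks, so a dangling "build" link counts as missing
    [ -d "${root}/lib/modules/${kver}/build" ]
}

# Example call (commented out so that sourcing this file has no side effects):
# has_kernel_build_tree "$(uname -r)" || echo "kernel build tree missing"
```

If the check fails, the matching kernel development package has to be installed first; on a UEK kernel this is typically ''kernel-uek-devel'', but verify the package name for your distribution.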
**Solution**: \\
Install the Mellanox drivers manually.

Procedure:
  * Mount the ISO image of the Mellanox driver to /mnt/mellanox
  * Create the directory /tmp/build_driver_mellanox
  * Copy the source code from the src directory of the ISO image to /tmp/build_driver_mellanox
  * Copy the configuration file (from the failed installation!) to /tmp/build_driver_mellanox
  * Start the kernel driver installation (./install.pl -c ofed.conf)
  * Install the tools from the RPM directory of the ISO image

Mount the InfiniBand setup ISO image:

  # mkdir /mnt/mellanox
  # mount -o loop -t iso9660 /installfiles/06_mellanox/MLNX_OFED_LINUX-1.5.3-3.0.0-rhel5.7-x86_64.iso /mnt/mellanox

Build the RPMs for the unsupported kernel:

  mkdir /tmp/build_driver_mellanox
  cd /mnt/mellanox/src
  cp MLNX_OFED_SRC-1.5.3-3.0.0.tgz /tmp/build_driver_mellanox
  cd /tmp/build_driver_mellanox
  tar zxvf MLNX_OFED_SRC-1.5.3-3.0.0.tgz
  cd MLNX_OFED_SRC-1.5.3-3.0.0

Copy ofed.conf to the /tmp/build_driver_mellanox/MLNX_OFED_SRC-1.5.3-3.0.0 directory. \\
Uninstall the package scsi-target-utils:

  yum remove scsi-target-utils

Start the build and installation of the kernel module:

  ./install.pl -c ofed.conf
  Below is the list of OFED packages that you have chosen
  (some may have been added by the installer due to package dependencies):
  ofed-scripts
  kernel-ib
  kernel-ib-devel
  kernel-mft
  Uninstalling the previous version of OFED
  Build ofed-scripts RPM
  ...
  Device (15b3:634a):
  47:00.0 InfiniBand: Mellanox Technologies MT25418 [ConnectX VPI PCIe 2.0 2.5GT/s - IB DDR / 10GigE] (rev a0)
  Link Width: 8x
  PCI Link Speed: 2.5Gb/s
  Installation finished successfully.

If the output ends with "Installation finished successfully", the driver installation is complete.
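
When the build output is captured to a file, this check can be scripted. The following is a minimal sketch; the function name ''install_succeeded'' and the log path in the usage note are hypothetical, and only the success message itself is taken from the installer output.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if the given log file contains the
# success line printed at the end of the install.pl output.
#   $1 = path to a file holding the captured installer output
install_succeeded() {
    # -q: no output, exit status only; -F: match the text literally
    grep -qF 'Installation finished successfully' "$1"
}
```

A possible use, assuming the output is saved with tee: run ''./install.pl -c ofed.conf 2>&1 | tee /tmp/ofed_build.log'' and then ''install_succeeded /tmp/ofed_build.log && echo "driver build OK"''.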

Go back to the ISO image and install the Mellanox support tools from the RPMS directory:

  cd /mnt/mellanox/RPMS/
  yum --nogpgcheck install \
    opensm-3.3.9.MLNX_20111006_e52d5fc-0.1.x86_64.rpm \
    rds-tools-2.0.4-1.x86_64.rpm \
    infiniband-diags-1.5.8.MLNX_20110906-0.1.x86_64.rpm \
    ibutils2-2.0-0.34.g9d3133a.x86_64.rpm \
    ibutils-1.5.7-0.1.g05a9d1a.x86_64.rpm \
    opensm-libs-3.3.9.MLNX_20111006_e52d5fc-0.1.x86_64.rpm \
    libibmad-1.3.7.MLNX_20110814-0.1.x86_64.rpm \
    libibumad-1.3.7.MLNX_20110814-0.1.x86_64.rpm

Check in the InfiniBand configuration file that the RDS driver is loaded at startup:

  $ cat /etc/infiniband/openib.conf | grep RDS
  # Load RDS module
  RDS_LOAD=yes

If it is not, set the parameter RDS_LOAD to yes on each node. If the parameter is missing entirely, add it.

See also: http://www.hpcadvisorycouncil.com/events/2011/switzerland_workshop/pdf/Presentations/Day%201/2_InfiniBand_Training.pdf

==== Sources ====

  * http://www.qlogic.com/SiteCollectionDocuments/Solutions/HP/hp_silverstorm_10g_rac_rds_v8_public.pdf
  * http://www.servercare.com/assets/files/downloads/2008_121_RAC_11g_BEST_PRACTICES_AND_TUNING.doc
  * http://www.oracle.com/technetwork/database/features/availability/s281216-tsien-131087.pdf
  * http://www.openfabrics.org/archives/spring2010sonoma/Monday/8.30%20Tim%20Shetler%20Oracle/Sonoma_Workshop_2010_Oracle-final.pdf
  * http://www.hpcuserforum.com/presentations/April2009Roanoke/OFAOPENFABRICSSpring2009OFA.ppt
  * http://www.voltaire.com/Solutions/Database_Applications/oracle_10g_and_11g_real_application_clusters_rac
  * http://www.voltaire.com/download/VOLT-OracleSolutionGuide-092108.pdf
  * http://www.dell.com/downloads/global/power/ps2q07-20070279-Mahmood.pdf
  * http://www.unyoug.com/uploads/files/20090313/UNYOUG_20090313_Leveraging_Infiniband_v2.ppt
  * http://www.texmemsys.com/files/oracle_performance_tuning_with_ssd.pdf
  * http://www.oracle.com/global/de/events/2007/locals/germany/odd_hochverfuegbarkeit/ORACLE_ODD_IT_Betrieb_RAC_ASM_Solbach.pdf