Filer General

 

Logging of console messages is configured from

/etc/syslog.conf.sample

By default there is no /etc/syslog.conf; once the user modifies the sample file and saves it under that name, they will have

/etc/syslog.conf    - which tells the filer where to direct console messages (typically /etc/messages)

 

sysconfig -t    (tape information)

source -v /etc/rc    - reads and executes, line by line, any file containing filer commands

 

AutoSupport (user-triggered)

options autosupport.doit autosupport@netapp.com

 

Telnetting to Filer

Only one user can be logged in via telnet at a time

options telnet

 

Autosupport Configuration

Filer> options autosupport

autosupport.mailhost < >

autosupport.support.to < autosupport@netapp.com >

autosupport.doit <string>

autosupport.support.transport   https    or  smtp

autosupport.support.url < url address must be reachable >

 

 

Autosupport troubleshooting

1. Ping netapp.com from the filer.

2. TCP 443 (SSL) should be open at the SMTP server.

   The SMTP server may sit on the DMZ side.

3. Mail relay must be specified in Exchange. The filer's host name or IP address must be listed in the mail relay settings, and routing to netapp.com (by this host name or by this IP) must be enabled for the filer. The filer acts as an SMTP client. In a typical mail setup, an SMTP client cannot send mail through the mail server to another SMTP server when the host's identity does not match the mail ID, so relaying is generally blocked.

4. The HTTP/HTTPS proxy server must let the AutoSupport URL through.
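
The port-related checks above can be sketched from an admin workstation. This is a minimal sketch, not a NetApp tool: it only tests whether a TCP port answers from the machine where the script runs (not from the filer itself), and the host names in the usage comment are placeholders to replace with your own mail host and proxy.

```python
import socket


def tcp_reachable(host, port, timeout=3.0):
    """Return True if a TCP connection to host:port opens within `timeout` seconds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts and name-resolution failures.
        return False


# Example usage (placeholder hosts, adjust to your site):
#   tcp_reachable("mailhost.example.com", 25)    # SMTP relay
#   tcp_reachable("proxy.example.com", 443)      # HTTPS via proxy
```

A False result only says the port was not reachable from this host; firewall rules between the filer and the mail server may differ.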

 

Raid Scrub weekly

a. raid.scrub.duration 360       - scrub runs for only 6 hrs (360 minutes)

b. raid.scrub.schedule sun@01    - forces the scrub to start on Sunday at 1 am
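
As a quick sanity check on the two values above, here is a small sketch that converts the duration to hours and splits the schedule string. It assumes the `day@hour` form exactly as written in these notes; the helper names are made up for illustration and the filer itself may accept a richer schedule syntax.

```python
def scrub_duration_hours(minutes):
    """Convert a raid.scrub.duration value (given in minutes) to hours."""
    return minutes / 60


def parse_day_at_hour(schedule):
    """Split a 'day@hour' string such as 'sun@01' into (day, hour)."""
    day, hour = schedule.split("@")
    return day.lower(), int(hour)


print(scrub_duration_hours(360))    # 360 minutes expressed in hours
print(parse_day_at_hour("sun@01"))  # day name and start hour
```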

 

RAID group

vol add vol0 -g rg0 2    - adds 2 disks to RAID group 0 of vol0

vol options yervol raidsize 16    - changes the raidsize setting of the vol yervol to 16

vol create newvol -t raid_dp -r 16 32@36

                       - creates newvol with raid_dp protection.

                         RAID group size is 16 disks. Since the vol

                         consists of 32 disks (36 GB each), they

                         form 2 RAID groups, rg0 & rg1

                        Max RAID group size

                              RAID-DP   28

                              RAID 4    14
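
The group arithmetic above (32 disks with raidsize 16 giving rg0 and rg1) can be sketched as follows. This is only an illustration: it assumes groups are filled in order up to the raidsize, the function names are hypothetical, and the limits are the ones quoted in these notes.

```python
# Maximum RAID group sizes as quoted in the notes above.
MAX_RAIDSIZE = {"raid_dp": 28, "raid4": 14}


def raid_group_layout(total_disks, raidsize):
    """Return [(group_name, disk_count), ...], filling each group up to raidsize."""
    if total_disks <= 0 or raidsize <= 0:
        raise ValueError("total_disks and raidsize must be positive")
    groups = []
    remaining = total_disks
    index = 0
    while remaining > 0:
        count = min(raidsize, remaining)
        groups.append((f"rg{index}", count))
        remaining -= count
        index += 1
    return groups


def raidsize_allowed(raid_type, raidsize):
    """Check a raidsize against the quoted maximum for the RAID type."""
    return raidsize <= MAX_RAIDSIZE[raid_type]


print(raid_group_layout(32, 16))
```

With 20 disks and raidsize 16, the same function would report a full rg0 and a partial rg1 of 4 disks.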

 

vol options for snapshots

nosnapdir off  < default off >

nosnap off < default off >

 

Disk Fail/unfail

priv set advanced

disk fail

disk unfail

sysconfig -d

When a disk goes bad partially, a prefail copy is seen when sysconfig -r is run. Sometimes it may just hang there, so disk fail -i <disk name> would release the disk & reconstruct the RAID group.

Disk troubleshoot

priv set advanced

led_on <1d.16>

led_off <drive id>

blink_on 4.19    (the failed disk will now show orange)

blink_off 4.19

 

Spare disks in vol

vol status -s

 

FAS 270  (this must be done, otherwise the disk is not seen)

priv set advanced

disk show -v    - shows which filer owns the disk. If the disk has come from

                  another filer, its disk block header needs to be

                  removed. For that:

disk unfail <disk id>

disk assign 0b.23

fcadmin device_map

 

If a drive is not shown in FilerView

Filer> storage show disk -p

 

Zeroing disks

priv set advanced

disk zero spares    - zeroes out the data on all spares

sysconfig -r        - shows the zeroing progress (%) of the spare disks

 

R100 & R150 Disk Swap

1. Find the bad disk and identify it

2. Type disk swap <disk id>

3. Remove the disk

4. Wait 20 sec

5. Type disk swap again

6. Insert the new disk

7. Wait 20 sec for the rescan

 

 

Out of inodes

 

All Rights Reserved   Copyright @ 2007

 

1. Check the % of inodes used with

    Filer> df -i

2. To increase the limit

   Filer> maxfiles <vol name> <max>
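
To decide on a value for <max>, the percentage can be worked out from the used and free inode counts. The sketch below is an illustration only: the function names are hypothetical, the 70% target is an arbitrary assumption rather than a NetApp recommendation, and the numbers in the example are made up.

```python
def inode_percent_used(iused, ifree):
    """Percentage of inodes in use, given used and free counts."""
    total = iused + ifree
    if total <= 0:
        raise ValueError("iused + ifree must be positive")
    return 100.0 * iused / total


def suggested_maxfiles(iused, target_percent=70):
    """Smallest limit at which iused is at most target_percent of the total.

    target_percent is an integer; 70 is an assumed target, not a vendor figure.
    """
    if not 0 < target_percent <= 100:
        raise ValueError("target_percent must be in 1..100")
    # Integer ceiling division avoids floating-point rounding surprises.
    return (iused * 100 + target_percent - 1) // target_percent


# Made-up example: 900 inodes used, 100 free.
print(inode_percent_used(900, 100))
print(suggested_maxfiles(900))
```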

 

NVRAM

Battery check

Filer> priv set diag

Filer> nv

 => should show battery status as OK and voltage as

       NVRAM3   6V

raid.timeout in options raid controls the trigger (24 hr) when the battery is low

On 940s, NVRAM5 also serves as the cluster interconnect card ("two in one"), in slot 11

 

Time Daemon

(ports 123, 13 and 37 must be open)

When there is a large skew, many messages like the following appear:

CfTimeDaemon : displacements /skews:10/3670,10/3670, 11/3670

Because of this, hourly snapshot creation also fails, or an "in progress" message appears.

With timed.max_skew set to 30 min, the above message may appear every 30 min to 1 hr.

Set it to 5s to see how the skew develops. If many skew messages appear (with timed.log turned on), MB replacement may be required.

As a temporary measure, set

cf.timed.enable on    - on both cluster filers, and watch whether those errors stop

Checking from unix host

# ntptrace -v filername

From filer check

Filer> options timed

      ( check all the options of this )

From FilerView => Set Date and Time : Synchronize now <ip of NTP server> => do Synchronize now, then check NTP from the unix host.

Tip: if the filer has multiple interfaces, make sure they are all properly listed in the NIS or DNS server; the same host name with multiple IP addresses may be required.

 

 

BPS (Bytes Per Sector) of Disk

Block Appended Checksum requires each disk drive to be formatted at 520 bytes per sector instead of 512. This provides a total of 4160 bytes in 8 sectors. This space is broken into two parts. The first part is 4096 bytes (4K, the WAFL block size) of file system data. The remaining 64 bytes contain the checksum data for the preceding 4096 bytes. In this manner, a checksum is appended to each block of data.
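
The byte counts above can be checked with a few lines of arithmetic. This is just a worked example of the figures in the paragraph; the function name is made up for illustration.

```python
def block_layout(sector_bytes=520, sectors=8, data_bytes=4096):
    """Split a group of sectors into file system data and checksum space."""
    total = sector_bytes * sectors
    return {"total": total, "data": data_bytes, "checksum": total - data_bytes}


print(block_layout())                  # 520-byte sectors
print(block_layout(sector_bytes=512))  # 512-byte sectors leave no room for a checksum
```

With 512-byte sectors the 8 sectors hold exactly 4096 bytes, which is why the extra 8 bytes per sector are needed for the appended checksum.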

 

 

 

Environmental Status

The top line for each channel shows failures as yes if there are any.

If there is no problem, the subsequent lines should all say none:

Power

Cooling

Temperature

Embedded switching

 

Volume

vol options vol0

vol status vol0 -r    (RAID info of the volume)

sysconfig -r

vol options vol0 raidsize 9

vol add vol0 <number of disks>

vol status -l    - displays all volumes

 

Aggr Volume creation

Filer> aggr create aggr1 10

Filer> vol create log1 aggr1 20g

 

When a vol has gone bad

vol wafliron start <vol name> -f

 

 

 

To list broken disks in a volume

vol status -f

sysconfig -r will also list the failed disks

 

Double Parity

vol create -t raid_dp -r 2   (minimum of two)

(there are two parity disks for holding parity and double parity data)

 

Environment status - e.g. temperature/shelf issues

environment chassis list_sensors

environment dump

 

 

RSH options   - rsh access to the filer

options rsh.enable on

An admin host must be added to use RSH (this can be done from FilerView). The RSH security settings must be set with either the IP or the hostname, and with a matching username for the logon account (not root, but the domain admin account).

RSH access from unix host

# rsh -l root:<console p/w> <ip of filer> "<command>"

( add this unix host to the /etc/hosts.equiv file; similar for a windows host as well )

  ( this command can be put in cron on unix to run it on a schedule )

RSH Port   514 / TCP
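
The rsh invocation and the cron idea above can be sketched as below. This only builds the command line and a crontab entry as strings; it does not contact a filer. It assumes the unix host is trusted via /etc/hosts.equiv so no password is passed, and "filer1" and the 01:00 schedule are placeholders.

```python
import shlex


def build_rsh_argv(filer, command, user="root"):
    """Build an argv list for running one filer command over rsh."""
    return ["rsh", "-l", user, filer, command]


def cron_line(minute, hour, argv):
    """Format a daily crontab entry that runs argv at hour:minute."""
    return f"{minute} {hour} * * * {shlex.join(argv)}"


# Placeholder filer name and command, for illustration only.
argv = build_rsh_argv("filer1", "sysconfig -r")
print(cron_line(0, 1, argv))
```

Passing the argv list to subprocess.run (rather than a single shell string) would avoid quoting problems if the filer command contains spaces.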

 

Registry Walk

 

Filer> registry walk status.vol.<vol name>